SPECIFICATION
CONSENSUS/ANCESTRAL IMMUNOGENS

This application is related to Provisional 
Application No. 60/503,460, filed September 17, 
2003, the entire content of which is incorporated 
herein by reference.



TECHNICAL FIELD 

The present invention relates, in general, to 
an immunogen and, in particular, to an immunogen for 
inducing antibodies that neutralize a wide spectrum 
of HIV primary isolates and/or to an immunogen that
induces a T cell immune response. The invention
also relates to a method of inducing anti-HIV
antibodies, and/or to a method of inducing a T cell
immune response, using such an immunogen. The
invention further relates to nucleic acid sequences
encoding the present immunogens.

BACKGROUND 

The high level of genetic variability of HIV-1
has presented a major hurdle for AIDS vaccine
development. Genetic differences among HIV-1 groups
M, N, and O are extensive, ranging from 30% to 50%
in gag and env genes, respectively (Gurtler et al,
J. Virol. 68:1581-1585 (1994), Vanden Haesevelde et
al, J. Virol. 68:1586-1596 (1994), Simon et al, Nat.
Med. 4:1032-1037 (1998), Kuiken et al, Human
retroviruses and AIDS 2000: a compilation and
analysis of nucleic acid and amino acid sequences
(Theoretical Biology and Biophysics Group, Los
Alamos National Laboratory, Los Alamos, New
Mexico)). Viruses within group M are further
classified into nine genetically distinct subtypes
(A-D, F-H, J and K) (Kuiken et al, Human
retroviruses and AIDS 2000: a compilation and
analysis of nucleic acid and amino acid sequences
(Theoretical Biology and Biophysics Group, Los
Alamos National Laboratory, Los Alamos, New Mexico),
Robertson et al, Science 288:55-56 (2000), Robertson
et al, Human retroviruses and AIDS 1999: a
compilation and analysis of nucleic acid and amino
acid sequences, eds. Kuiken et al (Theoretical
Biology and Biophysics Group, Los Alamos National
Laboratory, Los Alamos, New Mexico), pp. 492-505
(2000)). With the genetic variation as high as 30%
in env genes among HIV-1 subtypes, it has been
difficult to consistently elicit cross-subtype T and
B cell immune responses against all HIV-1 subtypes.
HIV-1 also frequently recombines among different
subtypes to create circulating recombinant forms
(CRFs) (Robertson et al, Science 288:55-56 (2000),
Robertson et al, Human retroviruses and AIDS 1999: a
compilation and analysis of nucleic acid and amino
acid sequences, eds. Kuiken et al (Theoretical
Biology and Biophysics Group, Los Alamos National
Laboratory, Los Alamos, New Mexico), pp. 492-505
(2000), Carr et al, Human retroviruses and AIDS
1998: a compilation and analysis of nucleic acid and
amino acid sequences, eds. Korber et al (Theoretical
Biology and Biophysics Group, Los Alamos National
Laboratory, Los Alamos, New Mexico), pp. III-10-III-19
(1998)). Over 20% of HIV-1 isolates are
recombinant in geographic areas where multiple
subtypes are common (Robertson et al, Nature
374:124-126 (1995), Cornelissen et al, J. Virol.
70:8209-8212 (1996), Dowling et al, AIDS 16:1809-1820
(2002)), and high prevalence rates of
recombinant viruses may further complicate the
design of experimental HIV-1 immunogens.

To overcome these challenges in AIDS vaccine
development, three computer models (consensus,
ancestor and center of the tree) have been used to
generate centralized HIV-1 genes (Gaschen et al,
Science 296:2354-2360 (2002), Gao et al, Science
299:1517-1518 (2003), Nickle et al, Science
299:1515-1517 (2003), Novitsky et al, J. Virol.
76:5435-5451 (2002), Ellenberger et al, Virology
302:155-163 (2002), Korber et al, Science 288:1789-1796
(2000)). The biology of HIV gives rise to
star-like phylogenies, and as a consequence of this,
the three kinds of sequences differ from each other
by 2-5% (Gao et al, Science 299:1517-1518 (2003)).
Any of the three centralized gene strategies will
reduce the protein distances between immunogens and
field virus strains. Consensus sequences minimize
the degree of sequence dissimilarity between a
vaccine strain and contemporary circulating viruses
by creating artificial sequences based on the most
common amino acid in each position in an alignment
(Gaschen et al, Science 296:2354-2360 (2002)).
Ancestral sequences are similar to consensus
sequences but are generated using maximum-likelihood
phylogenetic analysis methods (Gaschen et al,
Science 296:2354-2360 (2002), Nickle et al, Science
299:1515-1517 (2003)). In doing so, this method
recreates the hypothetical ancestral genes of the
analyzed current wild-type sequences (Figure 26).
Nickle et al proposed another method to generate
centralized HIV-1 sequences, center of the tree
(COT), that is similar to ancestral sequences but
less influenced by outliers (Science 299:1515-1517
(2003)).
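The consensus approach described above (the most common amino acid at each position of an alignment) can be illustrated with a minimal Python sketch. The function name `consensus`, the toy alignment, and the alphabetical tie-breaking rule are assumptions made for illustration only; they are not taken from the cited studies, and gap characters are simply counted like any other symbol.

```python
from collections import Counter


def consensus(aligned):
    """Return the most common residue at each column of an alignment.

    `aligned` is a list of equal-length strings (one aligned sequence each).
    Ties are broken alphabetically so the result is deterministic; this
    tie-breaking rule is an illustrative assumption.
    """
    if not aligned:
        raise ValueError("alignment is empty")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to equal length")

    residues = []
    for column in zip(*aligned):
        counts = Counter(column)
        # max() keeps the first maximal item, so sorting first gives
        # the alphabetically smallest residue among tied candidates.
        residues.append(max(sorted(counts), key=lambda r: counts[r]))
    return "".join(residues)


if __name__ == "__main__":
    toy_alignment = ["MRVKG", "MRVRG", "MKVKG"]  # hypothetical sequences
    print(consensus(toy_alignment))
```

Ancestral and center-of-the-tree sequences require a phylogenetic tree and a substitution model, so they are not reproduced by a column-wise count such as this one.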

The present invention results, at least in
part, from the results of studies designed to
determine if centralized immunogens can induce both
T and B cell immune responses in animals. These
studies involved the generation of an artificial
group M consensus env gene (CON6), and construction
of DNA plasmids and recombinant vaccinia viruses to
express CON6 envelopes as soluble gp120 and gp140CF
proteins. The results demonstrate that CON6 Env
proteins are biologically functional, possess
linear, conformational and glycan-dependent epitopes
of wild-type HIV-1, and induce cytokine-producing T
cells that recognize T cell epitopes of both HIV
subtypes B and C. Importantly, CON6 gp120 and
gp140CF proteins induce antibodies that neutralize
subsets of subtype B and C HIV-1 primary isolates.
The iterative nature of study of the
centralized HIV-1 gene approach is derived from the
rapidly expanding evolution of HIV-1 sequences, and
the fact that sequences collected in the HIV
sequence database (that is, the Los Alamos National
Database) are continually being updated with new
sequences each year. The CON6 gp120 envelope gene
derives from Year 1999 Los Alamos National Database
sequences, and Con-S derives from Year 2000 Los
Alamos National Database sequences. In addition,
CON6 has Chinese subtype C V1, V2, V4, and V5 Env
sequences, while Con-S has all group M consensus Env
constant and variable regions, that have been
shortened to minimal-length variable loops. Codon-optimized
genes for a series of Year 2003 group M
and subtype consensus sequences have been designed,
as have a corresponding series of wild-type HIV-1
Env genes for comparison, for use in inducing
broadly reactive T and B cell responses to HIV-1
primary isolates.

SUMMARY OF THE INVENTION 

The present invention relates to an immunogen
for inducing antibodies that neutralize a wide
spectrum of HIV primary isolates and/or to an
immunogen that induces a T cell immune response, and
to nucleic acid sequences encoding same. The
invention also relates to a method of inducing anti-HIV
antibodies, and/or to a method of inducing a T
cell immune response, using such an immunogen.

Objects and advantages of the present invention 
will be clear from the description that follows. 
BRIEF DESCRIPTION OF THE DRAWINGS

Figures 1A-1D: Generation and expression of
the group M consensus env gene (CON6). The complete
amino acid sequence of CON6 gp160 is shown.
(Fig. 1A) The five regions from the wild-type
CRF08_BC (98CN006) env gene are indicated by
underlined letters. Variable regions are indicated
by brackets above the sequences. Potential N-linked
glycosylation sites are highlighted with bold-faced
letters. (Fig. 1B) Constructs of CON6 gp120 and
gp140CF. CON6 gp120 and gp140CF plasmids were
engineered by introducing a stop codon after the
gp120 cleavage site or before the transmembrane
domain, respectively. The gp120/gp41 cleavage site
and fusion domain of gp41 were deleted in the
gp140CF protein. (Fig. 1C) Expression of CON6 gp120
and gp140CF. CON6 gp120 and gp140CF were purified
from the cell culture supernatants of rVV-infected
293T cells with Galanthus nivalis agarose lectin
columns. Both gp120 and gp140CF were separated on a
10% SDS-polyacrylamide gel and stained with Coomassie
blue. (Fig. 1D) CON6 env gene optimized based on
codon usage for highly expressed human genes.

Figures 2A-2E. Binding of CON6 gp120 and gp140CF
to soluble CD4 (sCD4) and anti-Env mAbs. (Figs. 2A-2B)
Each of the indicated mabs and sCD4 was
covalently immobilized to a CM5 sensor chip
(BIAcore) and CON6 gp120 (Fig. 2A) or gp140CF (Fig.
2B) (100 µg/ml and 300 µg/ml, respectively) were
injected over each surface. Both gp120 and gp140CF
proteins reacted with each of the anti-gp120 mabs tested
except for 17b mab, which showed negligible binding
to both CON6 gp120 and gp140CF. To determine
induction of 17b mab binding to CON6 gp120 and
gp140CF, CON6 gp120 (Fig. 2C) or gp140CF (Fig. 2D)
proteins were captured (400-580 RU) on individual
flow cells immobilized with sCD4 or mabs A32 or T8.
Following stabilization of each of the surfaces, mAb
17b was injected and flowed over each of the
immobilized flow cells. Overlay of curves shows that
the binding of mab 17b to CON6 Env proteins was
markedly enhanced on both sCD4 and mab A32 surfaces
but not on the T8 surface (Figs. 2C-2D). To
determine binding of CON6 gp120 and gp140CF to human
mabs in ELISA, stock solutions of 20 µg/ml of mabs
447, F39F, A32, IgG1b12 and 2F5 on CON6 gp120 and
gp140CF were titered (Fig. 2E). Mabs 447 (V3),
F39F (V3), A32 (gp120) and IgG1b12 (CD4 binding site)
each bound to both CON6 gp120 and gp140CF well, while
2F5 (anti-gp41 ELDKWAS) only bound gp140CF. The
concentration at endpoint titer on gp120 for mab 447
and F39F binding was <0.003 µg/ml and 0.006 µg/ml,
respectively; for mab A32 was <0.125 µg/ml; for
IgG1b12 was <0.002 µg/ml; and for 2F5 was 0.016
µg/ml.

Figures 3A and 3B. Infectivity and coreceptor
usage of CON6 envelope. (Fig. 3A) CON6 and control
env plasmids were cotransfected with HIV-1/SG3Δenv
backbone into human 293T cells to generate Env-pseudovirions.
Equal amounts of each pseudovirion
(5 ng p24) were used to infect JC53-BL cells. The
infectivity was determined by counting the number of
blue cells (infectious units, IU) per microgram of
p24 of pseudovirions (IU/µg p24) after staining the
infected cells for β-gal expression. (Fig. 3B)
Coreceptor usage of the CON6 env gene was determined
on JC53-BL cells treated with AMD3100 and/or TAK-779
for 1 hr (37°C) then infected with equal amounts of
p24 (5 ng) of each Env-pseudovirion. Infectivity in
the control group (no blocking agent) was set as
100%. Blocking efficiency was expressed as the
percentage of IU from blocking experiments compared
to those from control cultures without blocking
agents. Data shown are mean ± SD.
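The two quantities defined in this legend, infectious units per microgram of p24 and infectivity in the presence of a blocking agent as a percentage of the unblocked control, amount to simple normalizations. The sketch below shows the arithmetic only; the function names and all example counts are hypothetical and are not data from the study.

```python
def infectivity_iu_per_ug(blue_cells, p24_ng):
    """Infectious units (blue cells) per microgram of p24.

    `p24_ng` is the amount of p24 used for infection, in nanograms.
    """
    if p24_ng <= 0:
        raise ValueError("p24 input must be positive")
    return blue_cells * 1000.0 / p24_ng  # 1000 ng per microgram


def percent_of_control(iu_blocked, iu_control):
    """IU measured with a blocking agent, as a percentage of the control.

    The no-blocking-agent control is defined as 100%.
    """
    if iu_control <= 0:
        raise ValueError("control infectivity must be positive")
    return 100.0 * iu_blocked / iu_control


if __name__ == "__main__":
    # Hypothetical counts: 250 blue cells from a 5 ng p24 inoculum.
    print(infectivity_iu_per_ug(250, 5))
    # Hypothetical counts: 30 IU with inhibitor versus 120 IU without.
    print(percent_of_control(30, 120))
```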

Figure 4. Western blot analysis of multiple
subtype Env proteins against multiple subtype
antisera. Equal amounts of Env proteins (100 ng)
were separated on 10% SDS-polyacrylamide gels.
Following electrophoresis, proteins were transferred
to Hybond ECL nitrocellulose membranes and reacted
with sera from HIV-1 infected patients (1:1,000) or
guinea pigs immunized with CON6 gp120 DNA prime, rVV
boost (1:1,000). Protein-bound antibody was probed
with fluorescent-labeled secondary antibodies and
the images scanned and recorded on an infrared
imager Odyssey (Li-Cor, Lincoln, NE). Subtypes are
indicated by single letters after Env protein and
serum IDs. Four to six sera were tested for each
subtype, and reaction patterns were similar among
all sera from the same subtype. One representative
result for each subtype serum is shown.

Figure 5. T cell immune responses induced by
CON6 Env immunogens in mice. Splenocytes were
isolated from individual immunized mice (5
mice/group). After splenocytes were stimulated in
vitro with overlapping Env peptide pools of CON6
(black column), subtype B (hatched column), subtype
C (white column), and medium (no peptide; gray
column), IFN-γ producing cells were determined by
the ELISPOT assay. T cell IFN-γ responses induced
by either CON6 gp120 or gp140CF were compared to
those induced by subtype specific Env immunogens
(JRFL and 96ZM651). Total responses for each
envelope peptide pool are expressed as SFCs per
million splenocytes. The values for each column are
the mean ± SEM of IFN-γ SFCs (n=5 mice/group).
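The summary statistics named in this legend (SFCs per million splenocytes, reported as mean ± SEM over five mice) can be computed as in the following sketch. The number of cells plated per well and every spot count shown are hypothetical placeholders, since the legend does not give them; the function names are likewise illustrative.

```python
import math
from statistics import mean, stdev


def sfc_per_million(spots, cells_plated):
    """Scale a raw spot count to spot-forming cells per million cells."""
    if cells_plated <= 0:
        raise ValueError("cells_plated must be positive")
    return spots * 1_000_000 / cells_plated


def mean_sem(values):
    """Return (mean, standard error of the mean) for a group of animals.

    SEM is the sample standard deviation divided by sqrt(n).
    """
    if len(values) < 2:
        raise ValueError("need at least two values to estimate SEM")
    return mean(values), stdev(values) / math.sqrt(len(values))


if __name__ == "__main__":
    # Hypothetical raw counts for 5 mice, assuming 250,000 cells per well.
    raw_spots = [25, 50, 75, 100, 125]
    per_million = [sfc_per_million(s, 250_000) for s in raw_spots]
    group_mean, group_sem = mean_sem(per_million)
    print(f"{group_mean:.1f} +/- {group_sem:.1f} SFC per million splenocytes")
```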

Figures 6A-6E. Construction of codon usage
optimized subtype C ancestral and consensus envelope
genes (Figs. 6A and 6B, respectively). Ancestral
and consensus amino acid sequences (Figs. 6C and 6D,
respectively) were transcribed to mirror the codon
usage of highly expressed human genes. Paired
oligonucleotides (80-mers) overlapping by 20 bp were
designed to contain 5' invariant sequences including
the restriction enzyme sites EcoRI, BbsI, BamHI and
BsmBI. BbsI and BsmBI are Type II restriction
enzymes that cleave outside of their recognition
sequences. Paired oligomers were linked
individually using PCR and primers complementary to
the 18 bp invariant sequences in a stepwise fashion,
yielding 140 bp PCR products. These were subcloned
into pGEM-T and sequenced to confirm the absence of
inadvertent mutations/deletions. Four individual
pGEM-T subclones containing the proper inserts were
digested and ligated together into pcDNA3.1. Multi-fragment
ligations occurred repeatedly amongst groups
of fragments in a stepwise manner from the 5' to the
3' end of the gene until the entire gene was
reconstructed in pcDNA3.1. (See schematic in Fig.
6E.)
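Two steps in this legend lend themselves to a short illustration: rewriting an amino acid sequence with codons favored in highly expressed human genes, and the length of the product formed when two 80-mers overlapping by 20 bp are joined. The Python sketch below is an assumption-laden example, not the procedure actually used: the one-codon-per-residue table `PREFERRED_CODON` is an illustrative choice of commonly preferred human codons, and both function names are hypothetical.

```python
# Illustrative table: one frequently used human codon per amino acid.
# This is an assumption for the example, not the table used in the source.
PREFERRED_CODON = {
    "A": "GCC", "R": "CGG", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
    "*": "TGA",  # stop codon
}


def back_translate(protein):
    """Encode a protein sequence using the preferred codon for each residue."""
    try:
        return "".join(PREFERRED_CODON[aa] for aa in protein.upper())
    except KeyError as err:
        raise ValueError(f"unsupported residue: {err.args[0]!r}") from None


def joined_product_length(oligo_len, overlap):
    """Length of the product when two oligos sharing `overlap` bases are joined."""
    if overlap > oligo_len:
        raise ValueError("overlap cannot exceed oligo length")
    return 2 * oligo_len - overlap


if __name__ == "__main__":
    print(back_translate("MRVKGIQRN"))    # hypothetical peptide fragment
    print(joined_product_length(80, 20))  # two 80-mers sharing 20 bp
```

Practical codon optimization typically also screens the result for unwanted restriction sites, repeats, and extreme GC content, none of which this sketch attempts.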

Figure 7. JC53-BL cells are a derivative of
HeLa cells that express high levels of CD4 and the
HIV-1 coreceptors CCR5 and CXCR4. They also contain
the reporter cassettes of luciferase and β-galactosidase
that are each expressed from an HIV-1
LTR. Expression of the reporter genes is dependent
on production of HIV-1 Tat. Briefly, cells are
seeded into 24 or 96-well plates, incubated at 37°C
for 24 hours and treated with DEAE-Dextran at 37°C
for 30 minutes. Virus is serially diluted in 1%
DMEM, added to the cells incubating in DEAE-Dextran,
and allowed to incubate for 3 hours at 37°C, after
which additional cell media is added to each
well. Following a final 48-hour incubation at 37°C,
cells are either fixed and stained using X-Gal to
visualize β-galactosidase expressing blue foci or
frozen-thawed three times to measure luciferase
activity.

Figure 8. Sequence alignment of subtype C
ancestral and consensus env genes. Alignment of the
subtype C ancestral (bottom line) and consensus (top
line) env sequences showing a 95.5% sequence
homology; amino acid sequence differences are
indicated. One noted difference is the addition of a
glycosylation site in the C ancestral env gene at
the base of the V1 loop. A plus sign indicates a
within-class difference of amino acid at the
indicated position; a bar indicates a change in the
class of amino acid. Potential N-glycosylation sites
are marked in blue. The position of truncation for
the gp140 gene is also shown.
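A percentage such as the 95.5% figure quoted for this alignment is commonly obtained by counting matching positions between two aligned sequences. The sketch below shows one such calculation; the function name, the choice to skip columns where both sequences carry a gap, and the toy sequences are assumptions, and the legend does not state exactly how its figure was computed.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of aligned positions at which two sequences agree.

    Columns where both sequences have a gap ('-') are ignored; a gap
    opposite a residue counts as a mismatch (an illustrative convention).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical aligned fragments differing at a single position.
    print(percent_identity("MRVKGIQRNC", "MRVRGIQRNC"))
```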

Figure 9. Expression of subtype C ancestral
and consensus envelopes in 293T cells. Plasmids
containing codon-optimized gp160, gp140, or gp120
subtype C ancestral and consensus genes were
transfected into 293T cells, and protein expression
was examined by Western Blot analysis of cell
lysates. 48 hours post-transfection, cell lysates
were collected, total protein content determined by
the BCA protein assay, and 2 µg of total protein was
loaded per lane on a 4-20% SDS-PAGE gel. Proteins
were transferred to a PVDF membrane and probed with
HIV-1 plasma from a subtype C infected patient.

Figures 10A and 10B. Fig. 10A. Trans
complementation of env-deficient HIV-1 with codon-optimized
subtype C ancestral and consensus gp160
and gp140. Plasmids containing codon-optimized,
subtype C ancestral or consensus gp160 or gp140
genes were co-transfected into 293T cells with an
HIV-1/SG3Δenv provirus. 48 hours post-transfection
cell supernatants containing pseudotyped virus were
harvested, clarified by centrifugation, filtered
through a 0.2 µm filter, and pelleted through a 20%
sucrose cushion. Quantification of p24 in each
virus pellet was determined using the Coulter HIV-1
p24 antigen assay; 25 ng of p24 was loaded per lane
on a 4-20% SDS-PAGE gel for particles containing a
codon-optimized envelope. 250 ng of p24 was loaded
per lane for particles generated by co-transfection
of a rev-dependent wild-type subtype C 96ZAM651 env
gene. Differences in the amount of p24 loaded per
lane were necessary to ensure visualization of the
rev-dependent envelopes by Western Blot. Proteins
were transferred to a PVDF membrane and probed with
pooled plasma from HIV-1 subtype B and subtype C
infected individuals. Fig. 10B. Infectivity of
virus particles containing subtype C ancestral and
consensus envelope glycoproteins. Infectivity of
pseudotyped virus containing ancestral or consensus
gp160 or gp140 envelope was determined using the
JC53-BL assay. Sucrose cushion purified virus
particles were assayed by the Coulter p24 antigen
assay, and 5-fold serial dilutions of each pellet
were incubated with DEAE-Dextran treated JC53-BL
cells. Following a 48-hour incubation period, cells
were fixed and stained to visualize β-galactosidase
expressing cells. Infectivity is represented as
infectious units per ng of p24 to normalize for
differences in the concentration of the input
pseudovirions.

Figure 11. Co-receptor usage of subtype C
ancestral and consensus envelopes. Pseudotyped
particles containing ancestral or consensus envelope
were incubated with DEAE-Dextran treated JC53-BL
cells in the presence of AMD3100 (a specific
inhibitor of CXCR4), TAK779 (a specific inhibitor of
CCR5), or AMD3100+TAK779 to determine co-receptor
usage. NL4.3, an isolate known to utilize CXCR4,
and YU-2, a known CCR5-using isolate, were included
as controls.

Figures 12A-12C. Neutralization sensitivity of
subtype C ancestral and consensus envelope
glycoproteins. Equivalent amounts of pseudovirions
containing the ancestral, consensus or 96ZAM651
gp160 envelopes (1,500 infectious units) were pre-incubated
with a panel of plasma samples from HIV-1
subtype C infected patients and then added to the
JC53-BL cell monolayer in 96-well plates. Plates
were cultured for two days and luciferase activity
was measured as an indicator of viral infectivity.
Virus infectivity is calculated by dividing the
luciferase units (LU) produced at each concentration
of antibody by the LU produced by the control
infection. The mean 50% inhibitory concentration
(IC50) and the actual % neutralization at each
antibody dilution are then calculated for each
virus. The results of all luciferase experiments
are confirmed by direct counting of blue foci in
parallel infections.
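The calculation outlined in this legend (relative infectivity as LU divided by control LU, from which % neutralization and an IC50 follow) can be sketched as below. The legend does not say how the IC50 is derived from the dilution series, so the log-linear interpolation between the two flanking points is purely an assumption, as are the function names and the example readings.

```python
import math


def percent_neutralization(lu, lu_control):
    """Percent reduction in luciferase signal relative to the control infection."""
    if lu_control <= 0:
        raise ValueError("control LU must be positive")
    return 100.0 * (1.0 - lu / lu_control)


def ic50(concentrations, lus, lu_control):
    """Estimate the concentration giving 50% neutralization.

    Interpolates linearly in log10(concentration) between the two
    neighbouring points that bracket 50% (an assumed method). Returns
    None when the series never crosses 50% from below.
    """
    points = sorted(zip(concentrations, lus))
    previous = None
    for conc, lu in points:
        neut = percent_neutralization(lu, lu_control)
        if previous is not None:
            prev_conc, prev_neut = previous
            if prev_neut < 50.0 <= neut:
                fraction = (50.0 - prev_neut) / (neut - prev_neut)
                log_conc = math.log10(prev_conc) + fraction * (
                    math.log10(conc) - math.log10(prev_conc)
                )
                return 10 ** log_conc
        previous = (conc, neut)
    return None


if __name__ == "__main__":
    # Hypothetical readings: antibody concentrations and matching LU values.
    print(ic50([1, 10, 100], [900, 500, 100], lu_control=1000))
```

A fitted dose-response curve would usually be preferred over two-point interpolation when enough dilutions are available.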



Figures 13A-13F. Protein expression of
consensus subtype C Gag (Fig. 13A) and Nef (Fig.
13B) following transfection into 293T cells.
Consensus subtype C Gag and Nef amino acid sequences
are set forth in Figs. 13C and 13D, respectively,
and encoding sequences are set forth in Figs. 13E
and 13F, respectively.

Figures 14A-14C. Figs. 14A and 14B show the
Con-S Env amino acid sequence and encoding sequence,
respectively. Fig. 14C shows expression of Group M
consensus Con-S Env proteins using an in vitro
transcription and translation system.



Figures 15A and 15B. Expression of Con-S env
gene in mammalian cells. (Fig. 15A - cell lysate,
Fig. 15B - supernatant.)

Figures 16A and 16B. Infectivity (Fig. 16A)
and coreceptor usage (Fig. 16B) of CON6 and Con-S
env genes.

Figures 17A-17C. Env protein incorporation in
CON6 and Con-S Env-pseudovirions. (Fig. 17A -
lysate, Fig. 17B - supernatant, Fig. 17C - pellet.)

Figures 18A-18D. Figs. 18A and 18B show
subtype A consensus Env amino acid sequence and
nucleic acid sequence encoding same, respectively.
Figs. 18C and 18D show expression of A.con env gene
in mammalian cells (Fig. 18C - cell lysate, Fig. 18D
- supernatant).

Figures 19A-19H. M.con.gag (Fig. 19A),
M.con.pol (Fig. 19B), M.con.nef (Fig. 19C) and
C.con.pol (Fig. 19D) nucleic acid sequences and
corresponding encoded amino acid sequences (Figs.
19E-19H, respectively).

Figures 20A-20D. Subtype B consensus gag (Fig.
20A) and env (Fig. 20B) genes. Corresponding amino
acid sequences are shown in Figs. 20C and 20D.

Figure 21. Expression of subtype B consensus
env and gag genes in 293T cells. Plasmids
containing codon-optimized subtype B consensus
gp160, gp140, and gag genes were transfected into
293T cells, and protein expression was examined by
Western Blot analysis of cell lysates. 48 hours
post-transfection, cell lysates were collected,
total protein content determined by the BCA protein
assay, and 2 µg of total protein was loaded per lane
on a 4-20% SDS-PAGE gel. Proteins were transferred
to a PVDF membrane and probed with serum from an
HIV-1 subtype B infected individual.

Figure 22. Co-receptor usage of subtype B
consensus envelopes. Pseudotyped particles
containing the subtype B consensus gp160 Env were
incubated with DEAE-Dextran treated JC53-BL cells in
the presence of AMD3100 (a specific inhibitor of
CXCR4), TAK779 (a specific inhibitor of CCR5), and
AMD3100+TAK779 to determine co-receptor usage.
NL4.3, an isolate known to utilize CXCR4, and YU-2, a
known CCR5-using isolate, were included as controls.

Figures 23A and 23B. Trans complementation of
env-deficient HIV-1 with codon-optimized subtype B
consensus gp160 and gp140 genes. Plasmids
containing codon-optimized, subtype B consensus
gp160 or gp140 genes were co-transfected into 293T
cells with an HIV-1/SG3Δenv provirus. 48 hours
post-transfection cell supernatants containing
pseudotyped virus were harvested, clarified in a
tabletop centrifuge, filtered through a 0.2 µm
filter, and pelleted through a 20% sucrose cushion.
Quantification of p24 in each virus pellet was
determined using the Coulter HIV-1 p24 antigen
assay; 25 ng of p24 was loaded per lane on a 4-20%
SDS-PAGE gel. Proteins were transferred to a PVDF
membrane and probed with anti-HIV-1 antibodies from
infected HIV-1 subtype B patient serum. Trans
complementation with a rev-dependent NL4.3 env was
included for control. Figure 23B. Infectivity of
virus particles containing the subtype B consensus
envelope. Infectivity of pseudotyped virus
containing consensus B gp160 or gp140 was determined
using the JC53-BL assay. Sucrose cushion purified
virus particles were assayed by the Coulter p24
antigen assay, and 5-fold serial dilutions of each
pellet were incubated with DEAE-Dextran treated
JC53-BL cells. Following a 48-hour incubation
period, cells were fixed and stained to visualize
β-galactosidase expressing cells. Infectivity is
expressed as infectious units per ng of p24.

Figures 24A-24D. Neutralization sensitivity of
virions containing subtype B consensus gp160
envelope. Equivalent amounts of pseudovirions
containing the subtype B consensus or NL4.3 Env
(gp160) (1,500 infectious units) were preincubated
with three different monoclonal neutralizing
antibodies and a panel of plasma samples from HIV-1
subtype B infected individuals, and then added to
the JC53-BL cell monolayer in 96-well plates.
Plates were cultured for two days and luciferase
activity was measured as an indicator of viral
infectivity. Virus infectivity was calculated by
dividing the luciferase units (LU) produced at each
concentration of antibody by the LU produced by the
control infection. The mean 50% inhibitory
concentration (IC50) and the actual % neutralization
at each antibody dilution were then calculated for
each virus. The results of all luciferase
experiments were confirmed by direct counting of
blue foci in parallel infections. Fig. 24A.
Neutralization of Pseudovirions containing Subtype B
consensus Env (gp160). Fig. 24B. Neutralization of
Pseudovirions containing NL4.3 Env (gp160).
Fig. 24C. Neutralization of Pseudovirions containing
Subtype B consensus Env (gp160). Fig. 24D.
Neutralization of Pseudovirions containing NL4.3 Env
(gp160).

Figures 25A and 25B. Fig. 25A. Density and p24
analysis of sucrose gradient fractions. 0.5 ml
fractions were collected from a 20-60% sucrose
gradient. Fraction number 1 represents the most
dense fraction taken from the bottom of the gradient
tube. Density was measured with a refractometer and
the amount of p24 in each fraction was determined by
the Coulter p24 antigen assay. Fractions 6-9, 10-15,
16-21, and 22-25 were pooled together and
analyzed by Western Blot. As expected, virions
sedimented at a density of 1.16-1.18 g/ml.
Fig. 25B. VLP production by co-transfection of
subtype B consensus gag and env genes. 293T cells
were co-transfected with subtype B consensus gag and
env genes. Cell supernatants were harvested 48
hours post-transfection, clarified through a 20%
sucrose cushion, and further purified through a 20-60%
sucrose gradient. Select fractions from the
gradient were pooled, added to 20 ml of PBS, and
centrifuged overnight at 100,000 x g. Resuspended
pellets were loaded onto a 4-20% SDS-PAGE gel,
proteins were transferred to a PVDF membrane, and
probed with plasma from an HIV-1 subtype B infected
individual.

Figures 26A and 26B. Fig. 26A. 2000 Con-S
140CFI.ENV. Fig. 26B. Codon-optimized Year 2000
Con-S 140CFI.seq.

Figure 27. Individual C57BL/6 mouse T cell
responses to HIV-1 envelope peptides. Comparative
immunogenicity of CON6 gp140CFI and Con-S gp140CFI
in C57BL/6 mice. Mice were immunized with either
HIV5305 (Subtype A), 2801 (Subtype B), CON6 or Con-S
Envelope genes in DNA prime, rVV boost regimens, 5
mice per group. Spleen cells were assayed for IFN-γ
spot-forming cells 10 days after rVV boost, using
mixtures of overlapping peptides from Envs of HIV-1
UG37 (A), MN (B), Ch19 (C), 89.6 (B), SF162 (B) or no
peptide negative control.



Figures 28A-28C. Fig. 28A. Con-B 2003 Env.pep
(841 a.a.). Amino acid sequence underlined is the
fusion domain that is deleted in 140CF design and
the "W" underlined is the last amino acid at the
C-terminus; all amino acids after the "W" are
deleted in the 140CF design. Fig. 28B. Con-B-140CF.pep
(632 a.a.). Amino acids in bold identify
the junction of the deleted fusion cleavage site.
Fig. 28C. Codon-optimized Con-B 140CF.seq
(1927 nt.).

Figures 29A-29C. Fig. 29A. CON_OF_CONS-2003
(829 a.a.). Amino acid sequence underlined is the
fusion domain that is deleted in 140CF design and
the "W" underlined is the last amino acid at the
C-terminus; all amino acids after the "W" are
deleted in the 140CF design. Fig. 29B. ConS-2003
140CF.pep (620 a.a.). Amino acids in bold identify
the junction of the deleted fusion cleavage site.
Fig. 29C. CODON-OPTIMIZED ConS-2003 140CF.seq (1891
nt.).

Figures 30A-30C. Fig. 30A. CONSENSUS_A1-2003
(845 a.a.). Amino acid sequence underlined is the
fusion domain that is deleted in 140CF design and
the "W" underlined is the last amino acid at the
C-terminus; all amino acids after the "W" are
deleted in the 140CF design. Fig. 30B. Con-A1-2003
140CF.pep (629 a.a.). Amino acids in bold identify
the junction of the deleted fusion cleavage site.
Fig. 30C. CODON-OPTIMIZED Con-A1-2003.seq.
Figures 31A-31C. Fig. 31A. CONSENSUS_C-2003
(835 a.a.). Amino acid sequence underlined is the
fusion domain that is deleted in 140CF design and
the "W" underlined is the last amino acid at the
C-terminus; all amino acids after the "W" are
deleted in the 140CF design. Fig. 31B. Con-C 2003
140CF.pep (619 a.a.). Amino acids in bold identify
the junction of the deleted fusion cleavage site.
Fig. 31C. CODON-OPTIMIZED Con-C-2003 140CF (1,888
nt.).

Figures 32A-32C. Fig. 32A. CONSENSUS_G-2003
(842 a.a.). Amino acid sequence underlined is the
fusion domain that is deleted in 140CF design and
the "W" underlined is the last amino acid at the
C-terminus; all amino acids after the "W" are
deleted in the 140CF design. Fig. 32B. Con-G-2003
140CF.pep (626 a.a.). Amino acids in bold identify
the junction of the deleted fusion cleavage site.
Fig. 32C. CODON-OPTIMIZED Con-G-2003.seq.

Figures 33A-33C. Fig. 33A. CONSENSUS_01_AE-2003
(854 a.a.). Amino acid sequence underlined is
the fusion domain that is deleted in 140CF design
and the "W" underlined is the last amino acid at
the C-terminus; all amino acids after the "W" are
deleted in the 140CF design. Fig. 33B. Con-AE01-2003
140CF.pep (638 a.a.). Amino acids in bold
identify the junction of the deleted fusion cleavage
site. Fig. 33C. CODON-OPTIMIZED Con-AE01-2003.seq.
(1945 nt.).

Figures 34A-34C. Fig. 34A. Wild-type subtype
A Env. 00KE_MSA4076-A (Subtype A, 891 a.a.). Amino
acid sequence underlined is the fusion domain that
is deleted in 140CF design and the "W" underlined
is the last amino acid at the C-terminus; all amino
acids after the "W" are deleted in the 140CF design.
Fig. 34B. 00KE_MSA4076-A 140CF.pep (647 a.a.).
Amino acids in bold identify the junction of the
deleted fusion cleavage site. Fig. 34C. CODON-OPTIMIZED
00KE_MSA4076-A 140CF.seq. (1972 nt.).

Figures 35A-35C. Fig. 35A. Wild-type subtype
B. QH0515.1g gp160 (861 a.a.). Amino acid sequence
underlined is the fusion domain that is deleted in
140CF design and the "W" underlined is the last
amino acid at the C-terminus; all amino acids after
the "W" are deleted in the 140CF design. Fig. 35B.
QH0515.1g 140CF (651 a.a.). Amino acids in bold
identify the junction of the deleted fusion cleavage
site. Fig. 35C. CODON-OPTIMIZED QH0515.1g
140CF.seq (1984 nt.).

Figures 36A-36C. Fig. 36A. Wild-type subtype
C. DU123.6 gp160 (854 a.a.). Amino acid sequence
underlined is the fusion domain that is deleted in
140CF design and the "W" underlined is the last
amino acid at the C-terminus; all amino acids after
the "W" are deleted in the 140CF design. Fig. 36B.
DU123.6 140CF (638 a.a.). Amino acids in bold
identify the junction of the deleted fusion cleavage
site. Fig. 36C. CODON-OPTIMIZED DU123.6 140CF.seq
(1945 nt.).

Figures 37A-37C. Fig. 37A. Wild-type subtype
CRF01_AE. 97CNGX2F-AE (854 a.a.). Amino acid
sequence underlined is the fusion domain that is
deleted in the 140CF design, and the "W" underlined
is the last amino acid at the C-terminus; all amino
acids after the "W" are deleted in the 140CF design.
Fig. 37B. 97CNGX2F-AE 140CF.pep (629 a.a.). Amino
acids in bold identify the junction of the deleted
fusion cleavage site. Fig. 37C. CODON-OPTIMIZED
97CNGX2F-AE 140CF.seq (1921 nt.).

Figures 38A-38C. Fig. 38A. Wild-type DRCBL-G
(854 a.a.). Amino acid sequence underlined is the
fusion domain that is deleted in the 140CF design,
and the "W" underlined is the last amino acid at the
C-terminus; all amino acids after the "W" are
deleted in the 140CF design. Fig. 38B. DRCBL-G
140CF.pep (630 a.a.). Amino acids in bold identify
the junction of the deleted fusion cleavage site.
Fig. 38C. CODON-OPTIMIZED DRCBL-G 140CF.seq
(1921 nt.).

Figures 39A and 39B. Fig. 39A. 2003 Con-S
Env. Fig. 39B. 2003 Con-S Env.seq.opt.
(Seq.opt. = codon optimized encoding sequence.)

Figures 40A and 40B. Fig. 40A. 2003
M.Group.anc Env. Fig. 40B. 2003 M.Group.anc
Env.seq.opt. (Seq.opt. = codon optimized encoding
sequence.)

Figures 41A and 41B. Fig. 41A. 2003 CON_A1
Env. Fig. 41B. 2003 CON_A1 Env.seq.opt.
(Seq.opt. = codon optimized encoding sequence.)

Figures 42A and 42B. Fig. 42A. 2003 A1.anc
Env. Fig. 42B. 2003 A1.anc Env.seq.opt.
(Seq.opt. = codon optimized encoding sequence.)

Figures 43A and 43B. Fig. 43A. 2003 CON_A2
Env. Fig. 43B. 2003 CON_A2 Env.seq.opt.
(Seq.opt. = codon optimized encoding sequence.)

Figures 44A and 44B. Fig. 44A. 2003 CON_B
Env. Fig. 44B. 2003 CON_B Env.seq.opt.
(Seq.opt. = codon optimized encoding sequence.)

Figures 45A and 45B. Fig. 45A. 2003 B.anc
Env. Fig. 45B. 2003 B.anc Env.seq.opt.
(Seq.opt. = codon optimized encoding sequence.)

Figures 46A and 46B. Fig. 46A. 2003 CON_C
Env. Fig. 46B. 2003 CON_C Env.seq.opt.
(Seq.opt. = codon optimized encoding sequence.)

Figures 47A and 47B. Fig. 47A. 2003 C.anc
Env. Fig. 47B. 2003 C.anc Env.seq.opt.
(Seq.opt. = codon optimized encoding sequence.)

Figures 48A and 48B. Fig. 48A. 2003 CON_D
Env. Fig. 48B. 2003 CON_D Env.seq.opt.
(Seq.opt. = codon optimized encoding sequence.)

Figures 49A and 49B. Fig. 49A. 2003 CON_F1
Env. Fig. 49B. 2003 CON_F1 Env.seq.opt.
(Seq.opt. = codon optimized encoding sequence.)

Figures 50A and 50B. Fig. 50A. 2003 CON_F2
Env. Fig. 50B. 2003 CON_F2 Env.seq.opt.
(Seq.opt. = codon optimized encoding sequence.)

Figures 51A and 51B. Fig. 51A. 2003 CON_G
Env. Fig. 51B. 2003 CON_G Env.seq.opt.
(Seq.opt. = codon optimized encoding sequence.)

Figures 52A and 52B. Fig. 52A. 2003 CON_H
Env. Fig. 52B. 2003 CON_H Env.seq.opt.
(Seq.opt. = codon optimized encoding sequence.)



Figures 53A and 53B. Fig. 53A. 2003 CON_01_AE
Env. Fig. 53B. 2003 CON_01_AE Env.seq.opt.
(Seq.opt. = codon optimized encoding sequence.)

Figures 54A and 54B. Fig. 54A. 2003 CON_02_AG
Env. Fig. 54B. 2003 CON_02_AG Env.seq.opt.
(Seq.opt. = codon optimized encoding sequence.)

Figures 55A and 55B. Fig. 55A. 2003 CON_03_AB
Env. Fig. 55B. 2003 CON_03_AB Env.seq.opt.
(Seq.opt. = codon optimized encoding sequence.)

Figures 56A and 56B. Fig. 56A. 2003
CON_04_CPX Env. Fig. 56B. 2003 CON_04_CPX
Env.seq.opt. (Seq.opt. = codon optimized encoding
sequence.)

Figures 57A and 57B. Fig. 57A. 2003
CON_06_CPX Env. Fig. 57B. 2003 CON_06_CPX
Env.seq.opt. (Seq.opt. = codon optimized encoding
sequence.)

Figures 58A and 58B. Fig. 58A. 2003 CON_08_BC
Env. Fig. 58B. 2003 CON_08_BC Env.seq.opt.
(Seq.opt. = codon optimized encoding sequence.)

Figures 59A and 59B. Fig. 59A. 2003 CON_10_CD
Env. Fig. 59B. 2003 CON_10_CD Env.seq.opt.
(Seq.opt. = codon optimized encoding sequence.)

Figures 60A and 60B. Fig. 60A. 2003
CON_11_CPX Env. Fig. 60B. 2003 CON_11_CPX
Env.seq.opt. (Seq.opt. = codon optimized encoding
sequence.)

Figures 61A and 61B. Fig. 61A. 2003 CON_12_BF
Env. Fig. 61B. 2003 CON_12_BF Env.seq.opt.
(Seq.opt. = codon optimized encoding sequence.)

Figures 62A and 62B. Fig. 62A. 2003 CON_14_BG
Env. Fig. 62B. 2003 CON_14_BG Env.seq.opt.
(Seq.opt. = codon optimized encoding sequence.)

Figures 63A and 63B. Fig. 63A. 2003_CON_S
gag.PEP. Fig. 63B. 2003_CON_S gag.OPT.
(OPT = codon optimized encoding sequence.)

Figures 64A and 64B. Fig. 64A.
2003_M.GROUP.anc gag.PEP. Fig. 64B.
2003_M.GROUP.anc gag.OPT. (OPT = codon optimized
encoding sequence.)

Figures 65A-65D. Fig. 65A. 2003_CON_A1
gag.PEP. Fig. 65B. 2003_CON_A1 gag.OPT. Fig. 65C.
2003_A1.anc gag.PEP. Fig. 65D. 2003_A1.anc
gag.OPT. (OPT = codon optimized encoding sequence.)

Figures 66A and 66B. Fig. 66A. 2003_CON_A2
gag.PEP. Fig. 66B. 2003_CON_A2 gag.OPT.
(OPT = codon optimized encoding sequence.)

Figures 67A-67D. Fig. 67A. 2003_CON_B
gag.PEP. Fig. 67B. 2003_CON_B gag.OPT. Fig. 67C.
2003_B.anc gag.PEP. Fig. 67D. 2003_B.anc gag.OPT.
(OPT = codon optimized encoding sequence.)

Figures 68A-68D. Fig. 68A. 2003_CON_C
gag.PEP. Fig. 68B. 2003_CON_C gag.OPT. Fig. 68C.
2003_C.anc gag.PEP. Fig. 68D. 2003_C.anc gag.OPT.
(OPT = codon optimized encoding sequence.)

Figures 69A and 69B. Fig. 69A. 2003_CON_D
gag.PEP. Fig. 69B. 2003_CON_D gag.OPT.
(OPT = codon optimized encoding sequence.)

Figures 70A and 70B. Fig. 70A. 2003_CON_F
gag.PEP. Fig. 70B. 2003_CON_F gag.OPT.
(OPT = codon optimized encoding sequence.)

Figures 71A and 71B. Fig. 71A. 2003_CON_G
gag.PEP. Fig. 71B. 2003_CON_G gag.OPT.
(OPT = codon optimized encoding sequence.)

Figures 72A and 72B. Fig. 72A. 2003_CON_H
gag.PEP. Fig. 72B. 2003_CON_H gag.OPT.
(OPT = codon optimized encoding sequence.)






Figures 73A and 73B. Fig. 73A. 2003_CON_K
gag.PEP. Fig. 73B. 2003_CON_K gag.OPT.
(OPT = codon optimized encoding sequence.)

Figures 74A and 74B. Fig. 74A. 2003_CON_01_AE
gag.PEP. Fig. 74B. 2003_CON_01_AE gag.OPT.
(OPT = codon optimized encoding sequence.)

Figures 75A and 75B. Fig. 75A. 2003_CON_02_AG
gag.PEP. Fig. 75B. 2003_CON_02_AG gag.OPT.
(OPT = codon optimized encoding sequence.)

Figures 76A and 76B. Fig. 76A.
2003_CON_03_ABG gag.PEP. Fig. 76B. 2003_CON_03_ABG
gag.OPT. (OPT = codon optimized encoding sequence.)

Figures 77A and 77B. Fig. 77A.
2003_CON_04_CPX gag.PEP. Fig. 77B. 2003_CON_04_CPX
gag.OPT. (OPT = codon optimized encoding sequence.)

Figures 78A and 78B. Fig. 78A.
2003_CON_06_CPX gag.PEP. Fig. 78B. 2003_CON_06_CPX
gag.OPT. (OPT = codon optimized encoding sequence.)

Figures 79A and 79B. Fig. 79A. 2003_CON_07_BC
gag.PEP. Fig. 79B. 2003_CON_07_BC gag.OPT.
(OPT = codon optimized encoding sequence.)

Figures 80A and 80B. Fig. 80A. 2003_CON_08_BC
gag.PEP. Fig. 80B. 2003_CON_08_BC gag.OPT.
(OPT = codon optimized encoding sequence.)

Figures 81A and 81B. Fig. 81A. 2003_CON_10_CD
gag.PEP. Fig. 81B. 2003_CON_10_CD gag.OPT.
(OPT = codon optimized encoding sequence.)

Figures 82A and 82B. Fig. 82A.
2003_CON_11_CPX gag.PEP. Fig. 82B. 2003_CON_11_CPX
gag.OPT. (OPT = codon optimized encoding sequence.)

Figures 83A and 83B. Fig. 83A.
2003_CON_12_BF gag.PEP. Fig. 83B.
2003_CON_12_BF gag.OPT. (OPT = codon optimized
encoding sequence.)

Figures 84A and 84B. Fig. 84A. 2003_CON_14_BG
gag.PEP. Fig. 84B. 2003_CON_14_BG gag.OPT.
(OPT = codon optimized encoding sequence.)

Figures 85A and 85B. Fig. 85A. 2003_CONS
nef.PEP. Fig. 85B. 2003_CONS nef.OPT.
(OPT = codon optimized encoding sequence.)

Figures 86A and 86B. Fig. 86A.
2003_M.GROUP.anc nef.PEP. Fig. 86B.
2003_M.GROUP.anc nef.OPT. (OPT = codon optimized
encoding sequence.)

Figures 87A and 87B. Fig. 87A. 2003_CON_A
nef.PEP. Fig. 87B. 2003_CON_A nef.OPT.
(OPT = codon optimized encoding sequence.)

Figures 88A-88D. Fig. 88A. 2003_CON_A1
nef.PEP. Fig. 88B. 2003_CON_A1 nef.OPT. Fig. 88C.
2003_A1.anc nef.PEP. Fig. 88D. 2003_A1.anc
nef.OPT. (OPT = codon optimized encoding sequence.)

Figures 89A and 89B. Fig. 89A. 2003_CON_A2
nef.PEP. Fig. 89B. 2003_CON_A2 nef.OPT.
(OPT = codon optimized encoding sequence.)

Figures 90A-90D. Fig. 90A. 2003_CON_B
nef.PEP. Fig. 90B. 2003_CON_B nef.OPT. Fig. 90C.
2003_B.anc nef.PEP. Fig. 90D. 2003_B.anc nef.OPT.
(OPT = codon optimized encoding sequence.)

Figures 91A and 91B. Fig. 91A. 2003_CON_02_AG
nef.PEP. Fig. 91B. 2003_CON_02_AG nef.OPT.
(OPT = codon optimized encoding sequence.)

Figures 92A-92D. Fig. 92A. 2003_CON_C
nef.PEP. Fig. 92B. 2003_CON_C nef.OPT. Fig. 92C.
2003_C.anc nef.PEP. Fig. 92D. 2003_C.anc nef.OPT.
(OPT = codon optimized encoding sequence.)

Figures 93A and 93B. Fig. 93A. 2003_CON_D
nef.PEP. Fig. 93B. 2003_CON_D nef.OPT.
(OPT = codon optimized encoding sequence.)

Figures 94A and 94B. Fig. 94A. 2003_CON_F1
nef.PEP. Fig. 94B. 2003_CON_F1 nef.OPT.
(OPT = codon optimized encoding sequence.)

Figures 95A and 95B. Fig. 95A. 2003_CON_F2
nef.PEP. Fig. 95B. 2003_CON_F2 nef.OPT.
(OPT = codon optimized encoding sequence.)

Figures 96A and 96B. Fig. 96A. 2003_CON_G
nef.PEP. Fig. 96B. 2003_CON_G nef.OPT.
(OPT = codon optimized encoding sequence.)

Figures 97A and 97B. Fig. 97A. 2003_CON_H
nef.PEP. Fig. 97B. 2003_CON_H nef.OPT.
(OPT = codon optimized encoding sequence.)

Figures 98A and 98B. Fig. 98A. 2003_CON_01_AE
nef.PEP. Fig. 98B. 2003_CON_01_AE nef.OPT.
(OPT = codon optimized encoding sequence.)

Figures 99A and 99B. Fig. 99A. 2003_CON_03_AE
nef.PEP. Fig. 99B. 2003_CON_03_AE nef.OPT.
(OPT = codon optimized encoding sequence.)

Figures 100A and 100B. Fig. 100A.
2003_CON_04_CPX nef.PEP. Fig. 100B.
2003_CON_04_CPX nef.OPT. (OPT = codon optimized
encoding sequence.)



Figures 101A and 101B. Fig. 101A.
2003_CON_06_CPX nef.PEP. Fig. 101B.
2003_CON_06_CPX nef.OPT. (OPT = codon optimized
encoding sequence.)

Figures 102A and 102B. Fig. 102A.
2003_CON_08_BC nef.PEP. Fig. 102B. 2003_CON_08_BC
nef.OPT. (OPT = codon optimized encoding sequence.)

Figures 103A and 103B. Fig. 103A.
2003_CON_10_CD nef.PEP. Fig. 103B. 2003_CON_10_CD
nef.OPT. (OPT = codon optimized encoding sequence.)

Figures 104A and 104B. Fig. 104A.
2003_CON_11_CPX nef.PEP. Fig. 104B.
2003_CON_11_CPX nef.OPT. (OPT = codon optimized
encoding sequence.)

Figures 105A and 105B. Fig. 105A.
2003_CON_12_BF nef.PEP. Fig. 105B. 2003_CON_12_BF
nef.OPT. (OPT = codon optimized encoding sequence.)

Figures 106A and 106B. Fig. 106A.
2003_CON_14_BG nef.PEP. Fig. 106B. 2003_CON_14_BG
nef.OPT. (OPT = codon optimized encoding sequence.)



Figures 107A and 107B. Fig. 107A. 2003_CON_S
pol.PEP. Fig. 107B. 2003_CON_S pol.OPT.
(OPT = codon optimized encoding sequence.)

Figures 108A and 108B. Fig. 108A.
2003_M.GROUP.anc pol.PEP. Fig. 108B.
2003_M.GROUP.anc pol.OPT. (OPT = codon optimized
encoding sequence.)

Figures 109A-109D. Fig. 109A. 2003_CON_A1
pol.PEP. Fig. 109B. 2003_CON_A1 pol.OPT.
Fig. 109C. 2003_A1.anc pol.PEP. Fig. 109D.
2003_A1.anc pol.OPT. (OPT = codon optimized
encoding sequence.)

Figures 110A and 110B. Fig. 110A. 2003_CON_A2
pol.PEP. Fig. 110B. 2003_CON_A2 pol.OPT.
(OPT = codon optimized encoding sequence.)

Figures 111A-111D. Fig. 111A. 2003_CON_B
pol.PEP. Fig. 111B. 2003_CON_B pol.OPT.
Fig. 111C. 2003_B.anc pol.PEP. Fig. 111D.
2003_B.anc pol.OPT. (OPT = codon optimized
encoding sequence.)

Figures 112A-112D. Fig. 112A. 2003_CON_C
pol.PEP. Fig. 112B. 2003_CON_C pol.OPT.
Fig. 112C. 2003_C.anc pol.PEP. Fig. 112D.
2003_C.anc pol.OPT. (OPT = codon optimized
encoding sequence.)

Figures 113A and 113B. Fig. 113A. 2003_CON_D
pol.PEP. Fig. 113B. 2003_CON_D pol.OPT.
(OPT = codon optimized encoding sequence.)

Figures 114A and 114B. Fig. 114A. 2003_CON_F1
pol.PEP. Fig. 114B. 2003_CON_F1 pol.OPT.
(OPT = codon optimized encoding sequence.)

Figures 115A and 115B. Fig. 115A. 2003_CON_F2
pol.PEP. Fig. 115B. 2003_CON_F2 pol.OPT.
(OPT = codon optimized encoding sequence.)

Figures 116A and 116B. Fig. 116A. 2003_CON_G
pol.PEP. Fig. 116B. 2003_CON_G pol.OPT.
(OPT = codon optimized encoding sequence.)

Figures 117A and 117B. Fig. 117A. 2003_CON_H
pol.PEP. Fig. 117B. 2003_CON_H pol.OPT.
(OPT = codon optimized encoding sequence.)

Figures 118A and 118B. Fig. 118A.
2003_CON_01_AE pol.PEP. Fig. 118B. 2003_CON_01_AE
pol.OPT. (OPT = codon optimized encoding sequence.)

Figures 119A and 119B. Fig. 119A.
2003_CON_02_AG pol.PEP. Fig. 119B. 2003_CON_02_AG
pol.OPT. (OPT = codon optimized encoding sequence.)



Figures 120A and 120B. Fig. 120A.
2003_CON_03_AB pol.PEP. Fig. 120B. 2003_CON_03_AB
pol.OPT. (OPT = codon optimized encoding sequence.)

Figures 121A and 121B. Fig. 121A.
2003_CON_04_CPX pol.PEP. Fig. 121B.
2003_CON_04_CPX pol.OPT. (OPT = codon optimized
encoding sequence.)

Figures 122A and 122B. Fig. 122A.
2003_CON_06_CPX pol.PEP. Fig. 122B.
2003_CON_06_CPX pol.OPT. (OPT = codon optimized
encoding sequence.)

Figures 123A and 123B. Fig. 123A.
2003_CON_08_BC pol.PEP. Fig. 123B. 2003_CON_08_BC
pol.OPT. (OPT = codon optimized encoding sequence.)

Figures 124A and 124B. Fig. 124A.
2003_CON_10_CD pol.PEP. Fig. 124B. 2003_CON_10_CD
pol.OPT. (OPT = codon optimized encoding sequence.)

Figures 125A and 125B. Fig. 125A.
2003_CON_11_CPX pol.PEP. Fig. 125B.
2003_CON_11_CPX pol.OPT. (OPT = codon optimized
encoding sequence.)

Figures 126A and 126B. Fig. 126A.
2003_CON_12_BF pol.PEP. Fig. 126B. 2003_CON_12_BF
pol.OPT. (OPT = codon optimized encoding sequence.)

Figures 127A and 127B. Fig. 127A.
2003_CON_14_BG pol.PEP. Fig. 127B. 2003_CON_14_BG
pol.OPT. (OPT = codon optimized encoding sequence.)

DETAILED DESCRIPTION OF THE INVENTION 

The present invention relates to an immunogen
that induces antibodies that neutralize a wide
spectrum of human immunodeficiency virus (HIV)
primary isolates and/or that induces a T cell
response. The immunogen comprises at least one
consensus or ancestral immunogen (e.g., Env, Gag,
Nef or Pol), or portion or variant thereof. The
invention also relates to nucleic acid sequences
encoding the consensus or ancestral immunogen, or
portion or variant thereof. The invention further
relates to methods of using both the immunogen and
the encoding sequences. While the invention is
described in detail with reference to specific
consensus and ancestral immunogens (for example, to
a group M consensus Env), it will be appreciated
that the approach described herein can be used to
generate a variety of consensus or ancestral
immunogens (for example, envelopes for other HIV-1
groups (e.g., N and O)).

In accordance with one embodiment of the
invention, a consensus env gene can be constructed
by generating consensus sequences of env genes for
each subtype of a particular HIV-1 group (group M
being classified into subtypes A-D, F-H, J and K),
for example, from sequences in the Los Alamos HIV
Sequence Database (using, for example, MASE
(Multiple Aligned Sequence Editor)). A consensus
sequence of all subtype consensuses can then be
generated to avoid bias toward heavily sequenced
subtypes (Gaschen et al, Science 296:2354-2360
(2002), Korber et al, Science 288:1789-1796 (2000)).
In the case of the group M consensus env gene
described in Example 1 (designated CON6), five
highly variable regions from a CRF08_BC recombinant
strain (98CN006) (V1, V2, V4, V5 and a region in the
cytoplasmic domain of gp41) are used to fill in the
missing regions in the sequence (see, however,
corresponding regions for Con-S). For high levels of
expression, the codons of consensus or ancestral
genes can be optimized based on codon usage for
highly expressed human genes (Haas et al, Curr.
Biol. 6:315-324 (2000), Andre et al, J. Virol.
72:1497-1503 (1998)).
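
The two steps just described (a consensus of subtype consensuses, followed by codon optimization) can be illustrated with a minimal sketch. It assumes pre-aligned, gap-free sequences of equal length, uses toy four-residue sequences rather than real Env sequences, and relies on an illustrative preferred-codon table; the function names and the table are hypothetical and are not taken from MASE or from the cited references.

```python
from collections import Counter

# Illustrative preferred-codon table (an assumption for this sketch; actual
# work would draw on measured codon usage of highly expressed human genes).
PREFERRED_CODON = {
    "A": "GCC", "C": "TGC", "D": "GAC", "E": "GAG", "F": "TTC",
    "G": "GGC", "H": "CAC", "I": "ATC", "K": "AAG", "L": "CTG",
    "M": "ATG", "N": "AAC", "P": "CCC", "Q": "CAG", "R": "CGC",
    "S": "AGC", "T": "ACC", "V": "GTG", "W": "TGG", "Y": "TAC",
}


def column_consensus(aligned):
    """Most common residue at each column of equal-length aligned sequences."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    # Ties are resolved in favor of the residue seen first in the column.
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*aligned))


def group_consensus(by_subtype):
    """Consensus of subtype consensuses, so each subtype counts once."""
    subtype_consensuses = [column_consensus(seqs) for seqs in by_subtype.values()]
    return column_consensus(subtype_consensuses)


def codon_optimize(protein):
    """Back-translate a protein using one preferred codon per residue."""
    return "".join(PREFERRED_CODON[aa] for aa in protein)


# Toy alignment: subtype B is over-represented relative to A and C.
toy = {
    "A": ["MRVT"],
    "B": ["MKVK", "MKVK", "MKVK", "MKVK"],
    "C": ["MRVT", "MRVT"],
}
pooled = column_consensus([s for seqs in toy.values() for s in seqs])
centralized = group_consensus(toy)
print(pooled, centralized, codon_optimize(centralized))
```

In this toy case the pooled consensus follows the heavily sampled subtype, whereas the consensus of subtype consensuses does not, which is the sampling bias the two-level approach is intended to reduce.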

With the Year 1999 consensus group M env gene,
CON6, it has been possible to demonstrate induction
of superior T cell responses by CON6 versus
wild-type B and C env by the number of ELISPOT
γ-interferon spleen spot forming cells and the
number of epitopes recognized in two strains of mice
(Tables 1 and 2 show the data in BALB/c mice). The
ability of CON6 Env protein to induce neutralizing
antibodies to HIV-1 primary isolates has been
compared to that of several subtype B Env. The
target of neutralizing antibodies induced by CON6
includes several non-B HIV-1 strains.



Table 1. T cell epitope mapping of CON6, JRFL and 96ZM651
Env immunogen in BALB/c mice

Peptide          Immunogen                          T cell
                 CON6   JRFL (B)   96ZM651 (C)      response
CON6 (group M consensus)

18 DTEVHNVWATHACVP + + CD4 

48 KNSSEYYRUNCMTS j. A 

49 EYYRUNCNTSAfTQ * ^ CD4 

§ "^ISS^agp + CD4 

02 NVSTVOCTMGtXPW + CD4 

104 ETITLPCPIKQaNM . „_ 

106 LPCRl K QIIN VfWQGV + CO 8 

+ CD4 

134 AQQKLLQLTVWGDCQLO + CD4 

135 LQL7VWGOCQLC2ARVL 



Subtype B (MN)

0223 AKAYDTEVMNVWATO 

0224 DTEVKNVWATQACVP 



+ CD4 

+ CD4 

0280 RKFUH1CPGRAFYTT + CDS 

0287 HK5PGHAFYTTKNH ^ ° 



8347 



CD4 



Subtype C(Chn19) 

4834 VPVWKEAKTTLFCASOAKSY + CD4 

4838 CKEVMNVWATMACVPTT3PNP + + CD4 

4848 SSENSSEYYRLWCWTSAJT + + CD4 

4854 STVQCTMG IKPWS TQU-LN + CD4 

4884 QQSNLLRAJEAOQHLijQLTV + CD4 

+ CD4

Table 2. T cell epitope mapping of CON6.gp120
immunogen in C57BL/6 mice

Peptide      Peptide sequence      T cell response

CON6 (consensus)
2 
3 

16 
53 
97 
99 



GIQRNCQHLWRWGTM

NCQHLWRWGTMILGM

DTEVHNVWATHACVP

CPKVSFEPIPIHYCA

FYCNTSGLFNSTWMF

FNSTWMFNGTYMFNG



CD8

CD4
CD4
CD8
CD8



Subtype B (MN) 
6210 
6211 

6232 

6262 

6290 
6291 



GIRRNYQHWWGWGTM

NYQHWWGWGTMLLGL

NMWKNNMVEQMHEDI

ISFEPIPIHYCAPAG

NIIGTIRQAHCNISR

TIRQAHCNISRAKWN



CD8

CD4
CD4
CD4



Subtype C (Chn 19) 
4830 

5446 

4836 

4862 

4888 



MRVTGIRKNYQHLWRWGTML
RWGTMLLGMLMICSAAEN
GKEVHNVWATHACVPTDPNP
GDIRQAHCNISKDKWNETLQ
LLGIWGCSGKLICTTTVPWN



CD8
CD8
CD4
CD4
CD8



For the Year 2000 consensus group M env gene,
Con-S, the Con-S envelope has been shown to be as
immunogenic as the CON6 envelope gene in T cell
γ-interferon ELISPOT assays in two strains of mice
(the data for C57BL/6 are shown in Fig. 27).
Furthermore, in comparing CON6 and Con-S gp140 Envs
as protein immunogens for antibody in guinea pigs
(Table 3), both gp140 Envs were found to induce
antibodies that neutralized subtype B primary
isolates. However, Con-S gp140 also induced robust
neutralization of the subtype C isolates TV-1 and
DU123 as well as one subtype A HIV-1 primary isolate,
while CON6 did not.

As the next iteration of consensus immunogens,
and in recognition of the fact that a practical
HIV-1 immunogen can be a polyvalent mixture of either
several subtype consensus genes, a mixture of
subtype and consensus genes, or a mixture of
centralized genes and wild type genes, a series of
11 subtype consensus and wild type genes have been
designed from subtypes A, B, C, CRF AE01, and G as
well as a group M consensus gene from Year 2003 Los
Alamos National Database sequences. The wild type
sequences were chosen either because they were known
to come from early transmitted HIV-1 strains (those
strains most likely to be necessary to be protected
against by a vaccine) or because they were the most
recently submitted strains in the database of that
subtype. These nucleotide and amino acid sequences
are shown in Figures 28-38 (for all 140CF designs
shown, the 140CF gene can be flanked with the 5'
sequence "TTCAGTCGACGGCCACC" that contains a Kozak
sequence (GCCACCATGG/A) and SalI site and a 3'
sequence of TAAAGATCTTACAA containing a stop codon
and BglII site). Shown in Figures 39-62 are 2003
centralized (consensus and ancestral) HIV-1 envelope
proteins and the codon optimized gene sequences.
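
By way of illustration, the flanking of a 140CF open reading frame with the 5' and 3' sequences given above can be sketched as follows. The two flanking sequences are those recited in the text; the SalI (GTCGAC) and BglII (AGATCT) recognition sequences are the standard ones. The function name, the toy open reading frame, and the sanity checks (for example, rejecting an insert that already carries one of the cloning sites) are assumptions made for this sketch and are not part of the disclosed method.

```python
FIVE_PRIME = "TTCAGTCGACGGCCACC"   # from the text: SalI site plus Kozak context
THREE_PRIME = "TAAAGATCTTACAA"     # from the text: stop codon plus BglII site
SALI = "GTCGAC"                    # SalI recognition sequence
BGLII = "AGATCT"                   # BglII recognition sequence
STOP_CODONS = {"TAA", "TAG", "TGA"}


def flank_140cf(orf):
    """Return 5' flank + ORF + 3' flank after basic sanity checks.

    The checks below are illustrative assumptions, not requirements
    stated in the specification.
    """
    orf = orf.upper()
    if not orf.startswith("ATG"):
        raise ValueError("ORF should begin with ATG to complete the Kozak context")
    if len(orf) % 3 != 0:
        raise ValueError("ORF length should be a multiple of three")
    if orf[-3:] in STOP_CODONS:
        raise ValueError("omit the stop codon; the 3' flank supplies one")
    construct = FIVE_PRIME + orf + THREE_PRIME
    for site in (SALI, BGLII):
        if construct.count(site) != 1:
            raise ValueError("cloning site " + site + " is not unique in the construct")
    return construct


# Hypothetical toy ORF (encodes Met-Arg-Val-Thr), not a real 140CF gene.
construct = flank_140cf("ATGCGCGTGACC")
print(construct)
```

The uniqueness check is one way to confirm that digestion with the two enzymes would release the insert intact; other cloning strategies would call for different checks.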

Major differences between CON6 gp140 (which
does not induce antibodies that neutralize non-clade
B HIV strains) and Con-S gp140 (which does induce
antibodies that neutralize non-clade B HIV strains)
are in the Con-S V1, V2, V4 and V5 regions. For
clade B strains, peptides of the V3 region can
induce neutralizing antibodies (Haynes et al, J.
Immunol. 151:1646-1653 (1993)). Thus, construction
of Th-V1, Th-V2, Th-V4 and Th-V5 peptides can be
expected to give rise to the desired broadly
reactive anti-non-clade B neutralizing antibodies.
Therefore, the Th-V peptides set forth in Table 4
are contemplated for use as a peptide immunogen(s)
derived from Con-S gp140. The gag Th determinant
(GTH, Table 4), or any homologous GTH sequence in
other HIV strains, can be used to promote
immunogenicity, and the C4 region of HIV gp120 can
be used as well (KQIINMWQWGKAMYA) or any homologous
C4 sequence from other HIV strains (Haynes et al,
J. Immunol. 151:1646-1653 (1993)). Con-S V1, V2, V4,
V5 peptides with an N-terminal helper determinant
can be used singly or together, when formulated in a
suitable adjuvant such as Corixa's RC529 (Baldridge
et al, J. Endotoxin Res. 8:453-458 (2002)), to
induce broadly cross reactive neutralizing
antibodies to non-clade B isolates.

Table 4

No.  Peptide                 Sequence
1)   GTH Con-S V1 132-150    YKRV\^ILGLNKIVRMYT^^JVN^/T^^^^TN^^~EEKGEIKN
2)   GTH Con-S V2 157-189    YKRWIILGLNKIVRMYTeRDKKQKVYALFYRLDWPIDDNNNNSSNYR
3)   GTH Con-S V3 294-315    YKRWIILGLNKIVRMYTRPNNNTRKSIRIGPGQAFYAT
4)   GTH Con-S V4 381-408    YKRWIILGLNKIVRMYNTSGLFNSTWIGNGTKNNNNTNDTTTLP
5)   GTH Con-S V5 447-466    YKRWIILGLNKIVRMYRDGGNNNTNETEIFRPGGGD
6)   GTH Con-6 V1 132-150    YKRWIILGLNKIVRMYNVRNVSSNGTETDNEEIKN
7)   GTH Con-6 V2 157-196    YKRWILGUYKWRMYTELRDKKQKM
8)   GTH Con-6 V3 301-322    YKRWIILGLNKIVRMYTRPNNNTRKSIHIGPGQAFYAT
9)   GTH Con-6 V4 388-418    YKRWIILGLNKIVRMYNTSGLFNSTWMFNGTYMFNGTrKDNSETTTLP
10)  GTH Con-6 V5 457-477    YKRWIILGLNKIVRMYRDGGNNSNKNKTETFRPGGGD
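
The construction of a Th-V peptide as listed in Table 4, that is, a helper determinant placed N-terminal to a variable-region segment, can be sketched as below. The 16-residue GTH value is taken here as the prefix common to the Table 4 entries, which is an inference from the table rather than a separately stated sequence; the helper names are hypothetical, and the coordinate helper is shown on a toy string because the full Con-S sequence is not reproduced in this passage.

```python
# Helper determinant, inferred as the shared N-terminal prefix of the
# Table 4 peptides (an assumption made for this sketch).
GTH = "YKRWIILGLNKIVRMY"


def excise(sequence, start, end):
    """Return residues start..end using 1-based inclusive coordinates,
    the convention used for the region ranges in Table 4."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("coordinates fall outside the sequence")
    return sequence[start - 1:end]


def th_v_peptide(v_region, helper=GTH):
    """Place the T-helper determinant N-terminal to a variable-region segment."""
    return helper + v_region


# V3 segment of entry 3 in Table 4 (Con-S V3, residues 294-315, 22 residues).
con_s_v3 = "TRPNNNTRKSIRIGPGQAFYAT"
peptide = th_v_peptide(con_s_v3)
print(peptide, len(peptide))

# Coordinate helper demonstrated on a toy string, not on an Env sequence.
print(excise("ABCDEFGHIJ", 3, 5))
```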



It will be appreciated that the invention
includes portions and variants of the sequences
specifically disclosed herein. For example, forms
of codon optimized consensus encoding sequences can
be constructed as gp140CF, gp140CFI, gp120 or gp160
forms with either gp120/41 cleaved or uncleaved.
For example, and as regards the consensus and
ancestral envelope sequences, the invention
encompasses envelope sequences devoid of V3.
Alternatively, V3 sequences can be selected from
preferred sequences, for example, those described in
U.S. Application No. 10/431,596 and U.S. Provisional
Application No. 60/471,327. In addition, an optimal
immunogen for breadth of response can include
mixtures of group M consensus gag, pol, nef and env
encoding sequences, and can also consist of
mixtures of subtype consensus or ancestral encoding
sequences for gag, pol, nef and env HIV genes. For
dealing with regional differences in virus strains,
an efficacious mixture can include mixtures of
consensus/ancestral and wild type encoding
sequences.

A consensus or ancestral envelope of the
invention can be "activated" to expose
intermediate conformations of neutralization
epitopes that normally are only transiently or less
well exposed on the surface of the HIV virion. The
immunogen can be a "frozen" triggered form of a
consensus or ancestral envelope that makes available
specific epitopes for presentation to B lymphocytes.
The result of this epitope presentation is the
production of antibodies that broadly neutralize
HIV. (Attention is directed to WO 02/024149 and to
the activated/triggered envelopes described
therein.)

The concept of a fusion intermediate immunogen
is consistent with observations that the gp41 HR-2
region peptide, DP178, can capture an uncoiled
conformation of gp41 (Furata et al, Nature Struct.
Biol. 5:276 (1998)), and that formalin-fixed
HIV-infected cells can generate broadly neutralizing
antibodies (LaCasse et al, Science 283:357 (1997)).
Recently a monoclonal antibody against the
coiled-coil region bound to a conformational
determinant of gp41 in HR1 and HR2 regions of the
coiled-coil gp41 structure, but did not neutralize
HIV (Jiang et al, J. Virol. 10213 (1998)). However,
this latter study proved that the coiled-coil region
is available for antibody to bind if the correct
antibody is generated.

The immunogen of one aspect of the invention
comprises a consensus or ancestral envelope either
in soluble form or anchored, for example, in cell
vesicles or in liposomes containing translipid
bilayer envelope. To make a more native envelope,
gp140 or gp160 consensus or ancestral sequences can
be configured in lipid bilayers for native trimeric
envelope formation. Alternatively, triggered gp160
in aldrithiol-2 inactivated HIV-1 virions can be
used as an immunogen. The gp160 can also exist as a
recombinant protein either as gp160 or gp140 (gp140
is gp160 with the transmembrane region and possibly
other gp41 regions deleted). Bound to gp160 or
gp140 can be recombinant CCR5 or CXCR4 co-receptor
proteins (or their extracellular domain peptide or
protein fragments) or antibodies or other ligands
that bind to the CXCR4 or CCR5 binding site on
gp120, and/or soluble CD4, or antibodies or other
ligands that mimic the binding actions of CD4.
Alternatively, vesicles or liposomes containing CD4,
CCR5 (or CXCR4), or soluble CD4 and peptides
reflective of CCR5 or CXCR4 gp120 binding sites, can
be used. Alternatively, an optimal CCR5 peptide
ligand can be a peptide from the N-terminus of CCR5
wherein specific tyrosines are sulfated (Bormier et
al, Proc. Natl. Acad. Sci. USA 97:5762 (2001)). The
triggered immunogen may not need to be bound to a
membrane but may exist and be triggered in solution.
Alternatively, soluble CD4 (sCD4) can be replaced by
an envelope (gp140 or gp160) triggered by CD4
peptide mimetopes (Vitra et al, Proc. Natl. Acad.
Sci. USA 96:1301 (1999)). Other HIV co-receptor
molecules that "trigger" the gp160 or gp140 to
undergo changes associated with a structure of gp160
that induces cell fusion can also be used. Ligation
of soluble HIV gp140 primary isolate HIV 89.6
envelope with soluble CD4 (sCD4) induced
conformational changes in gp41.

In one embodiment, the invention relates to an
immunogen that has the characteristics of a receptor
(CD4)-ligated consensus or ancestral envelope with
the CCR5 binding region exposed but, unlike
CD4-ligated proteins that have the CD4 binding site
blocked, this immunogen has the CD4 binding site
exposed (open). Moreover, this immunogen can be
devoid of host CD4, which avoids the production of
potentially harmful anti-CD4 antibodies upon
administration to a host.

The immunogen can comprise consensus or
ancestral envelope ligated with a ligand that binds
to a site on gp120 recognized by an A32 monoclonal
antibody (mab) (Wyatt et al, J. Virol. 69:5723
(1995), Boots et al, AIDS Res. Hum. Retro. 13:1549
(1997), Moore et al, J. Virol. 68:8350 (1994),
Sullivan et al, J. Virol. 72:4694 (1998), Fouts et
al, J. Virol. 71:2779 (1997), Ye et al, J. Virol.
74:11955 (2000)). One A32 mab has been shown to
mimic CD4 and, when bound to gp120, upregulates
(exposes) the CCR5 binding site (Wyatt et al, J.
Virol. 69:5723 (1995)). Ligation of gp120 with such
a ligand also upregulates the CD4 binding site and
does not block CD4 binding to gp120.
Advantageously, such ligands also upregulate the
HR-2 binding site of gp41 bound to cleaved gp120,
uncleaved gp140 and cleaved gp41, thereby further
exposing HR-2 binding sites on these proteins, each
of which is a potential target for anti-HIV
neutralizing antibodies.

In a specific aspect of this embodiment, the
immunogen comprises soluble HIV consensus or
ancestral gp120 envelope ligated with either an
intact A32 mab, a Fab2 fragment of an A32 mab, or a
Fab fragment of an A32 mab, with the result that the
CD4 binding site, the CCR5 binding site and the HR-2
binding site on the consensus or ancestral envelope
are exposed/upregulated. The immunogen can comprise
consensus or ancestral envelope with an A32 mab (or
fragment thereof) bound or can comprise consensus or
ancestral envelope with an A32 mab (or fragment
thereof) bound and cross-linked with a cross-linker
such as 0.3% formaldehyde or a heterobifunctional
cross-linker such as DTSSP (Pierce Chemical
Company). The immunogen can also comprise uncleaved
consensus or ancestral gp140 or a mixture of
uncleaved gp140, cleaved gp41 and cleaved gp120. An
A32 mab (or fragment thereof) bound to consensus or
ancestral gp140 and/or gp120, or to gp120
non-covalently bound to gp41, results in upregulation
(exposure) of HR-2 binding sites in gp41, gp120 and
uncleaved gp140. Binding of an A32 mab (or fragment
thereof) to gp120 or gp140 also results in
upregulation of the CD4 binding site and the CCR5
binding site. As with gp120 containing complexes,
complexes comprising uncleaved gp140 and an A32 mab
(or fragment thereof) can be used as an immunogen
uncross-linked or cross-linked with a cross-linker
such as 0.3% formaldehyde or DTSSP. In one
embodiment, the invention relates to an immunogen
comprising soluble uncleaved consensus or ancestral
gp140 bound and cross-linked to a Fab fragment or
whole A32 mab, optionally bound and cross-linked to
an HR-2 binding protein.

The consensus or ancestral envelope protein
triggered with a ligand that binds to the A32 mab
binding site on gp120 can be administered in
combination with at least a second immunogen
comprising a second envelope, triggered by a ligand
that binds to a site distinct from the A32 mab
binding site, such as the CCR5 binding site
recognized by mab 17b. The 17b mab (Kwong et al,
Nature 393:648 (1998); available from the AIDS
Reference Repository, NIAID, NIH) augments sCD4
binding to gp120. This second immunogen (which can
also be used alone or in combination with triggered
immunogens other than that described above) can, for
example, comprise soluble HIV consensus or ancestral
envelope ligated with either the whole 17b mab, a
Fab2 fragment of the 17b mab, or a Fab fragment of
the 17b mab. It will be appreciated that other CCR5
ligands, including other antibodies (or fragments
thereof), that result in the CD4 binding site being
exposed can be used in lieu of the 17b mab. This
further immunogen can comprise gp120 with the 17b
mab, or fragment thereof, (or other CCR5 ligand as
indicated above) bound or can comprise gp120 with
the 17b mab, or fragment thereof, (or other CCR5
ligand as indicated above) bound and cross-linked
with an agent such as 0.3% formaldehyde or a
heterobifunctional cross-linker, such as DTSSP
(Pierce Chemical Company). Alternatively, this
further immunogen can comprise uncleaved gp140
present alone or in a mixture of cleaved gp41 and
cleaved gp120. Mab 17b, or fragment thereof (or
other CCR5 ligand as indicated above) bound to gp140
and/or gp120 in such a mixture results in exposure
of the CD4 binding region. The 17b mab, or fragment
thereof, (or other CCR5 ligand as indicated above)
gp140 complexes can be present uncross-linked or
cross-linked with an agent such as 0.3% formaldehyde
or DTSSP.

Soluble HR-2 peptides, such as T649Q26L and
DP178, can be added to the above-described complexes
to stabilize epitopes on consensus gp120 and gp41 as
well as uncleaved consensus gp140 molecules, and can
be administered either cross-linked or
uncross-linked with the complex.

A series of monoclonal antibodies (mabs) has
been made that neutralize many HIV primary isolates,
including, in addition to the 17b mab described
above, mab IgG1b12 that binds to the CD4 binding
site on gp120 (Roben et al, J. Virol. 68:482 (1994),
Mo et al, J. Virol. 71:6869 (1997)), mab 2G12 that
binds to a conformational determinant on gp120
(Trkola et al, J. Virol. 70:1100 (1996)), and mab
2F5 that binds to a membrane proximal region of gp41
(Muster et al, J. Virol. 68:4031 (1994)).

As indicated above, various approaches can be
used to "freeze" fusogenic epitopes in accordance
with the invention. For example, "freezing" can be
effected by addition of the DP-178 or T-649Q26L
peptides that represent portions of the coiled coil
region, and that when added to CD4-triggered
consensus or ancestral envelope, result in
prevention of fusion (Rimsky et al, J. Virol.
72:986-993 (1998)). HR-2 peptide bound consensus or
ancestral gp120, gp140, gp41 or gp160 can be used as
an immunogen or crosslinked by a reagent such as
DTSSP or DSP (Pierce Co.), formaldehyde or other
crosslinking agent that has a similar effect.

"Freezing" can also be effected by the addition 
of 0.1% to 3% formaldehyde or paraformaldehyde, both
protein cross-linking agents, to the complex, to
stabilize the CD4, CCR5 or CXCR4, HR-2 peptide gp160
complex, or to stabilize the "triggered" gp41
molecule, or both (LaCasse et al, Science
283:357-362 (1999)).

Further, "freezing" of consensus or ancestral
gp41 or gp120 fusion intermediates can be effected
by addition of heterobifunctional agents such as DSP
(dithiobis[succinimidylpropionate]) (Pierce Co.,
Rockford, IL, No. 22585ZZ) or the water soluble
DTSSP (Pierce Co.) that use two NHS esters that are
reactive with amino groups to cross-link and
stabilize the CD4, CCR5 or CXCR4, HR-2 peptide gp160
complex, or to stabilize the "triggered" gp41
molecule, or both.

Analysis of T cell immune responses in
immunized or vaccinated animals and humans shows
that the envelope protein is normally not a main
target for T cell immune response although it is the
only gene product that induces neutralizing
antibodies. HIV-1 Gag, Pol and Nef proteins induce a
potent T cell immune response. Accordingly, the
invention includes a repertoire of consensus or
ancestral immunogens that can induce both humoral
and cellular immune responses. Subunits of consensus
or ancestral sequences can be used as T or B cell
immunogens. (See Examples 6 and 7, and Figures
referenced therein, and Figures 63-127.)

The immunogen of the invention can be
formulated with a pharmaceutically acceptable
carrier and/or adjuvant (such as alum) using
techniques well known in the art. Suitable routes
of administration of the present immunogen include
systemic (e.g., intramuscular or subcutaneous).
Alternative routes can be used when an immune
response is sought in a mucosal immune system (e.g.,
intranasal).

The immunogens of the invention can be
chemically synthesized and purified using methods
which are well known to the ordinarily skilled
artisan. The immunogens can also be synthesized by
well-known recombinant DNA techniques. Nucleic
acids encoding the immunogens of the invention can
be used as components of, for example, a DNA vaccine
wherein the encoding sequence is administered as
naked DNA or, for example, a minigene encoding the
immunogen can be present in a viral vector. The
encoding sequence can be present, for example, in a
replicating or non-replicating adenoviral vector, an
adeno-associated virus vector, an attenuated
Mycobacterium tuberculosis vector, a Bacillus
Calmette Guerin (BCG) vector, a vaccinia or Modified
Vaccinia Ankara (MVA) vector, another pox virus
vector, recombinant polio and other enteric virus
vector, Salmonella species bacterial vector,
Shigella species bacterial vector, Venezuelan
Equine Encephalitis Virus (VEE) vector, a Semliki
Forest Virus vector, or a Tobacco Mosaic Virus
vector. The encoding sequence can also be
expressed as a DNA plasmid with, for example, an
active promoter such as a CMV promoter. Other live
vectors can also be used to express the sequences of
the invention. Expression of the immunogen of the
invention can be induced in a patient's own cells,
by introduction into those cells of nucleic acids
that encode the immunogen, preferably using codons
and promoters that optimize expression in human
cells. Examples of methods of making and using DNA
vaccines are disclosed in U.S. Pat. Nos. 5,580,859,
5,589,466, and 5,703,055.

The composition of the invention comprises an
immunologically effective amount of the immunogen of
this invention, or nucleic acid sequence encoding
same, in a pharmaceutically acceptable delivery
system. The compositions can be used for prevention
and/or treatment of immunodeficiency virus
infection. The compositions of the invention can be
formulated using adjuvants, emulsifiers,
pharmaceutically-acceptable carriers or other
ingredients routinely provided in vaccine
compositions. Optimum formulations can be readily
designed by one of ordinary skill in the art and can
include formulations for immediate release and/or
for sustained release, and for induction of systemic
immunity and/or induction of localized mucosal
immunity (e.g., the formulation can be designed for
intranasal administration). The present
compositions can be administered by any convenient
route including subcutaneous, intranasal, oral,
intramuscular, or other parenteral or enteral route.
The immunogens can be administered as a single dose
or multiple doses. Optimum immunization schedules
can be readily determined by the ordinarily skilled
artisan and can vary with the patient, the
composition and the effect sought.

The invention contemplates the direct use of
the immunogen of the invention and/or nucleic
acids encoding same and/or the immunogen expressed
as minigenes in the vectors indicated above. For
example, a minigene encoding the immunogen can be
used as a prime and/or boost.

Certain aspects of the invention can be
described in greater detail in the non-limiting
Examples that follow.

EXAMPLE 1

Artificial HIV-1 Group M Consensus Envelope

EXPERIMENTAL DETAILS

Expression of CON6 gp120 and gp140 proteins in
recombinant vaccinia viruses (VV). To express and
purify the secreted form of HIV-1 CON6 envelope
proteins, CON6 gp120 and gp140CF plasmids were
constructed by introducing stop codons after the
gp120 cleavage site (REKR) and before the
transmembrane domain (YIKIFIMIVGGLIGLRIVFAVLSIVN),
respectively. The gp120/gp41 cleavage site and
fusion domain of gp41 were deleted in the gp140CF
protein. Both CON6 gp120 and gp140CF DNA constructs
were cloned into the pSC65 vector (from Bernard
Moss, NIH, Bethesda, MD) at SalI and KpnI
restriction enzyme sites. This vector contains the
lacZ gene that is controlled by the p7.5 promoter.
A back-to-back P E/L promoter was used to express
CON6 env genes. BSC-1 cells were seeded at 2 x 10^5
in each well in a 6-well plate and infected with
wild-type vaccinia virus (WR) at an MOI of 0.1
pfu/cell. Two hr after infection, pSC65-derived
plasmids containing CON6 env genes were transfected
into the VV-infected cells and recombinant (r) VV
selected as described (Moss and Earl, Current
Protocols in Molecular Biology, eds, Ausubel et al
(John Wiley & Sons, Inc. Indianapolis, IN) pp.
16.15.1-16.19.9 (1998)). Recombinant VV that
contained the CON6 env genes were confirmed by PCR
and sequencing analysis. Expression of the CON6
envelope proteins was confirmed by SDS-PAGE and
Western blot assay. Recombinant CON6 gp120 and
gp140CF were purified with agarose Galanthus nivalis
lectin beads (Vector Labs, Burlingame, CA), and
stored at -70°C until use. Recombinant VV expressing
JRFL (vCB-28) or 96ZM651 (vT241R) gp160 were obtained
from the NIH AIDS Research and Reference Reagent
Program (Bethesda, MD).

Monoclonal Antibodies and gp120 Wild-type
Envelopes. Human mabs against a conformational
determinant on gp120 (A32), the gp120 V3 loop (F39F)
and the CCR5 binding site (17b) were the gifts of
James Robinson (Tulane Medical School, New Orleans,
LA) (Wyatt et al, Nature 393:705-711 (1998), Wyatt
et al, J. Virol. 69:5723-5733 (1995)). Mabs 2F5,
447, b12, 2G12 and soluble CD4 were obtained from
the NIH AIDS Research and Reference Reagent Program
(Bethesda, MD) (Gorny et al, J. Immunol.
159:5114-5122 (1997), Nyambi et al, J. Virol.
70:6235-6243 (1996), Purtscher et al, AIDS Res. Hum.
Retroviruses 10:1651-1658 (1994), Trkola et al, J.
Virol. 70:1100-1108 (1996)). T8 is a murine mab that
maps to the gp120 C1 region (a gift from P. Earl,
NIH, Bethesda, MD). BaL (subtype B), 96ZM651
(subtype C), and 93TH975 (subtype E) gp120s were
provided by QBI, Inc. and the Division of AIDS, NIH.
CHO cell lines that express 92U037 (subtype A) and
93BR029 (subtype F) gp140 (secreted and uncleaved)
were obtained from NICBS, England.

Surface Plasmon Resonance Biosensor (SPR)
Measurements and ELISA. SPR biosensor measurements
were made on a BIAcore 3000 instrument (BIAcore
Inc., Uppsala, Sweden) and data analysis was
performed using BIAevaluation 3.0 software (BIAcore
Inc., Uppsala, Sweden). Anti-gp120 mabs (T8, A32,
17b, 2G12) or sCD4 in 10 mM Na-acetate buffer, pH
4.5, were directly immobilized to a CM5 sensor chip
using a standard amine coupling protocol for protein
immobilization. FPLC-purified CON6 gp120 monomer or
gp140CF oligomer recombinant proteins were flowed
over CM5 sensor chips at concentrations of 100 and
300 µg/ml, respectively. A blank in-line reference
surface (activated and deactivated for amine
coupling) or non-binding mab controls were used to
subtract non-specific or bulk responses. Soluble
89.6 gp120 and irrelevant IgG were used as positive
and negative controls, respectively, to ensure
activity of each mab surface prior to injecting the
CON6 Env proteins. Binding of CON6 envelope proteins
was monitored in real-time at 25°C with a continuous
flow of PBS (150 mM NaCl, 0.005% surfactant P20), pH
7.4, at 10-30 µl/min. Bound proteins were removed
and the sensor surfaces were regenerated following
each cycle of binding by single or duplicate 5-10 µl
pulses of regeneration solution (10 mM glycine-HCl,
pH 2.9). ELISA was performed to determine the
reactivity of various mabs to CON6 gp120 and gp140CF
proteins as described (Haynes et al, AIDS Res. Hum.
Retroviruses 11:211-221 (1995)). For assay of human
mab binding to rgp120 or gp140 proteins, end-point
titers were defined as the highest titer of mab
(beginning at 20 µg/ml) at which the mab bound CON6
gp120 and gp140CF Env proteins ≥3-fold over
background control (non-binding human mab).
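
By way of illustration only, the end-point rule described above (starting at 20 µg/ml and requiring a signal about 3-fold over the non-binding mab background) can be sketched as follows. The two-fold dilution scheme, the optical density readings, and the function name `endpoint_concentration` are hypothetical and are not taken from the study.

```python
def endpoint_concentration(concentrations, signals, background, fold=3.0):
    """Return the lowest mab concentration (i.e. the highest titer) whose
    signal is at least `fold` times the background, or None if none is."""
    positive = [c for c, s in zip(concentrations, signals)
                if s >= fold * background]
    return min(positive) if positive else None

# Hypothetical two-fold series starting at 20 ug/ml (assumed, not from the source).
concs = [20 / 2**i for i in range(6)]        # 20, 10, 5, 2.5, 1.25, 0.625 ug/ml
ods = [1.80, 1.20, 0.70, 0.35, 0.14, 0.09]   # illustrative readings only
endpoint = endpoint_concentration(concs, ods, background=0.10)
print(endpoint)
```

With these illustrative readings, the last dilution that still clears three times the background is taken as the end-point.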

Infectivity and coreceptor usage assays.
HIV-1/SG3Δenv and CON6 or control env plasmids were
cotransfected into human 293T cells. Pseudotyped
viruses were harvested and filtered, and p24
concentration was quantitated (DuPont/NEN Life
Sciences, Boston, MA). Equal amounts of p24 (5 ng)
for each pseudovirion were used to infect JC53-BL
cells to determine the infectivity (Derdeyn et al, J.
Virol. 74:8358-8367 (2000), Wei et al, Antimicrob.
Agents Chemother. 46:1896-1905 (2002)). JC53-BL
cells express CD4, CCR5 and CXCR4 receptors and
contain a β-galactosidase (β-gal) gene stably
integrated under the transcriptional control of an
HIV-1 long terminal repeat (LTR). These cells can
be used to quantify the infectious titers of
pseudovirion stocks by staining for β-gal expression
and counting the number of blue cells (infectious
units) per microgram of p24 of pseudovirions (IU/µg
p24) (Derdeyn et al, J. Virol. 74:8358-8367 (2000),
Wei et al, Antimicrob. Agents Chemother. 46:1896-1905
(2002)). To determine the coreceptor usage of the
CON6 env gene, JC53-BL cells were treated with 1.2 µM
AMD3100 and 4 µM TAK-779 for 1 hr at 37°C, then
infected with equal amounts of p24 (5 ng) of each
Env pseudotyped virus. The blockage efficiency was
expressed as the percentage of the infectious units
from blockage experiments compared to that from
control culture without blocking agents. The
infectivity of the control group (no blocking agent)
was arbitrarily set as 100%.
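
The two quantities used in this assay, the infectious titer (IU per microgram of p24) and the infectivity expressed as a percentage of the untreated control, are simple ratios. A minimal sketch is given below; the blue-cell counts are invented for illustration and the function names are hypothetical.

```python
def infectious_titer(blue_cells, p24_ng):
    """Infectious units per microgram of p24 (IU/ug p24)."""
    return blue_cells * 1000.0 / p24_ng

def percent_of_control(iu_with_blocker, iu_control):
    """Infectivity in the presence of a blocking agent, expressed as a
    percentage of the untreated control (control defined as 100%)."""
    return 100.0 * iu_with_blocker / iu_control

# Illustrative counts only: 250 blue cells from a 5 ng p24 inoculum.
titer = infectious_titer(250, p24_ng=5)
# Illustrative counts only: 12 blue cells with blocker versus 240 without.
remaining = percent_of_control(12, 240)
print(titer, remaining)
```

A low percentage of control for a given blocking agent indicates that the corresponding coreceptor is required for entry.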

Immunizations. All animals were housed in the
Duke University Animal Facility under AALAC
guidelines with animal use protocols approved by the
Duke University Animal Use and Care Committee.
Recombinant CON6 gp120 and gp140CF glycoproteins
were formulated in a stable emulsion with RIBI-CWS
adjuvant based on the protocol provided by the
manufacturer (Sigma Chemical Co., St. Louis, MO).
For induction of anti-envelope antibodies, each of
four out-bred guinea pigs (Harlan Sprague, Inc.,
Chicago, IL) was given 100 µg of either purified CON6
gp120 or gp140CF subcutaneously every 3 weeks (total
of 5 immunizations). Serum samples were
heat-inactivated (56°C, 1 hr), and stored at -20°C
until use.

For induction of anti-envelope T cell
responses, 6-8 wk old female BALB/c mice (Frederick
Cancer Research and Developmental Center, NCI,
Frederick, MD) were immunized i.m. in the quadriceps
with 50 µg plasmid DNA three times at a 3-week
interval. Three weeks after the last DNA
immunization, mice were boosted with 10^7 PFU of rVV
expressing Env proteins. Two weeks after the boost,
all mice were euthanized and spleens were removed
for isolation of splenocytes.

Neutralization assays. Neutralization assays
were performed using either an MT-2 assay as
described in Bures et al, AIDS Res. Hum.
Retroviruses 16:2019-2035 (2000), a luciferase-based
multiple replication cycle HIV-1 infectivity assay
in 5.25.EGFP.Luc.M7 cells using a panel of HIV-1
primary isolates (Bures et al, AIDS Res. Hum.
Retroviruses 16:2019-2035 (2000), Bures et al, J.
Virol. 76:2233-2244 (2002)), or a syncytium (fusion
from without) inhibition assay using inactivated
HIV-1 virions (Rossio et al, J. Virol. 72:7992-8001
(1998)). In the luciferase-based assay,
neutralizing antibodies were measured as a function
of a reduction in luciferase activity in
5.25.EGFP.Luc.M7 cells provided by Nathaniel R.
Landau, Salk Institute, La Jolla, CA (Brandt et al,
J. Biol. Chem. 277:17291-17299 (2002)). Five
hundred tissue culture infectious dose 50 (TCID50) of
cell-free virus was incubated with indicated serum
dilutions in 150 µl (1 hr, at 37°C) in triplicate in
96-well flat-bottom culture plates. The
5.25.EGFP.Luc.M7 cells were suspended at a density
of 5 x 10^5/ml in media containing DEAE dextran (10
µg/ml). Cells (100 µl) were added and incubated
until 10% of cells in control wells (no test serum
sample) were positive for GFP expression by
fluorescence microscopy. At this time the cells were
concentrated 2-fold by removing one-half volume of
media. A 50 µl suspension of cells was transferred
to 96-well white solid plates (Costar, Cambridge,
MA) for measurement of luciferase activity using
Bright-Glo™ substrate (Promega, Madison, WI) on a
Wallac 1420 Multilabel Counter (PerkinElmer Life
Sciences, Boston, MA). Neutralization titers in the
MT-2 and luciferase assays were the serum dilutions
at which >50% of virus infection was inhibited. Only
values that titered beyond 1:20 (i.e. >1:30) were
considered significantly positive. The syncytium
inhibition "fusion from without" assay utilized HIV-1
aldrithiol-2 (AT-2) inactivated virions from HIV-1
subtype B strains ADA and AD8 (the gift of Larry
Arthur and Jeffrey Lifson, Frederick Research Cancer
Facility, Frederick, MD) added to SupT1 cells, with
syncytium inhibition titers determined as those
titers where >90% of syncytia were inhibited
compared to prebleed sera.
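
As a rough sketch of how a 50% neutralization titer might be read from luciferase data of the kind described above, the following computes percent inhibition relative to a virus-only control and reports the highest reciprocal serum dilution that still exceeds the cutoff. The three-fold dilution series, the relative light unit (RLU) values, and the function names are assumptions made for illustration; no background subtraction or curve fitting is attempted.

```python
def percent_inhibition(rlu_sample, rlu_virus_control):
    """Percent reduction in luciferase signal relative to the virus-only control."""
    return 100.0 * (1.0 - rlu_sample / rlu_virus_control)

def neutralization_titer(rlu_by_dilution, rlu_virus_control, cutoff=50.0):
    """Highest reciprocal serum dilution at which inhibition exceeds `cutoff`
    percent; None if no dilution qualifies."""
    hits = [dilution for dilution, rlu in rlu_by_dilution.items()
            if percent_inhibition(rlu, rlu_virus_control) > cutoff]
    return max(hits) if hits else None

# Hypothetical readings keyed by reciprocal serum dilution (assumed series).
readings = {10: 500, 30: 1000, 90: 3000, 270: 4500, 810: 8000}
titer = neutralization_titer(readings, rlu_virus_control=10000)
# Positivity rule from the text: titers must exceed 1:20 to be counted.
is_positive = titer is not None and titer > 20
print(titer, is_positive)
```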

Enzyme linked immune spot (ELISPOT) assay.
Single-cell suspensions of splenocytes from
individual immunized mice were prepared by mincing
and forcing through a 70 µm Nylon cell strainer (BD
Labware, Franklin Lakes, NJ). Overlapping Env
peptides of CON6 gp140 (159 peptides, 15mers
overlapping by 11) were purchased from Boston
Bioscence, Inc (Royal Oak, MI). Overlapping Env
peptides of MN gp140 (subtype B; 170 peptides,
15mers overlapping by 11) and Chn19 gp140 (subtype
C; 69 peptides, 20mers overlapping by 10) were
obtained from the NIH AIDS Research and Reference
Reagent Program (Bethesda, MD). Splenocytes (5
mice/group) from each mouse were stimulated in vitro
with overlapping Env peptide pools from CON6,
subtype B and subtype C Env proteins. 96-well PVDF
plates (Multiscreen-IP, Millipore, Billerica, MA)
were coated with anti-IFN-γ mab (5 ng/ml, AN18;
Mabtech, Stockholm, Sweden). After the plates were
blocked at 37°C for 2 hr using complete Hepes
buffered RPMI medium, 50 µl of the pooled overlapping
envelope peptides (13 CON6 and MN pools, 13-14
peptides in each pool; 9 Chn19 pools, 7-8 peptides in
each pool) at a final concentration of 5 µg/ml of
each were added to the plate. Then 50 µl of
splenocytes at a concentration of 1.0 x 10^7/ml were
added to the wells in duplicate and incubated for 16
hr at 37°C with 5% CO2. The plates were incubated
with 100 µl of a 1:1000 dilution of streptavidin
alkaline phosphatase (Mabtech, Stockholm, Sweden),
and purple spots developed using 100 µl of BCIP/NBT
(Plus) Alkaline Phosphatase Substrate (Moss,
Pasadena, MD). Spot forming cells (SFC) were
measured using an Immunospot counting system (CTL
Analyzers, Cleveland, OH). Total responses for each
envelope peptide pool are expressed as SFCs per 10^6
splenocytes.
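
Two pieces of arithmetic are implicit in the ELISPOT description above: tiling a protein into 15mers that overlap by 11 residues (a step of 4), and scaling a per-well spot count to a fixed number of cells. The sketch below illustrates both. The 647-residue placeholder sequence is hypothetical (chosen only because it tiles into 159 peptides), normalization to 10^6 cells is an assumption, and the function names are illustrative.

```python
def overlapping_peptides(sequence, length=15, overlap=11):
    """Tile `sequence` into peptides of `length` residues, with consecutive
    peptides sharing `overlap` residues; a final peptide is added if needed
    so that the C-terminus is covered."""
    step = length - overlap
    last_start = len(sequence) - length
    starts = list(range(0, max(last_start, 0) + 1, step))
    if last_start > 0 and starts[-1] != last_start:
        starts.append(last_start)
    return [sequence[s:s + length] for s in starts]

def sfc_per_million(spots, cells_per_well):
    """Scale a per-well spot count to spot forming cells per 10^6 cells."""
    return spots * 1_000_000 / cells_per_well

# Placeholder sequence of hypothetical length, not an actual Env sequence.
peptides = overlapping_peptides("A" * 647)
# 50 ul of cells at 1.0 x 10^7/ml gives 5 x 10^5 cells per well.
cells_per_well = int(0.050 * 1.0e7)
normalized = sfc_per_million(40, cells_per_well)  # 40 spots is illustrative
print(len(peptides), cells_per_well, normalized)
```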

RESULTS 

CON6 Envelope Gene Design, Construction and
Expression. An artificial group M consensus env
gene (CON6) was constructed by generating consensus
sequences of env genes for each HIV-1 subtype from
sequences in the Los Alamos HIV Sequence Database,
and then generating a consensus sequence of all
subtype consensuses to avoid bias toward heavily
sequenced subtypes (Gaschen et al, Science
296:2354-2360 (2002), Korber et al, Science
288:1789-1796 (2000)). Five highly variable regions
from a CRF08_BC recombinant strain (98CN006) (V1, V2,
V4, V5 and a region in the cytoplasmic domain of
gp41) were then used to fill in the missing regions
in the CON6 sequence. The CON6 V3 region is group M
consensus (Figure 1A). For high levels of
expression, the codons of the CON6 env gene were
optimized based on codon usage for highly expressed
human genes (Haas et al, Curr. Biol. 6:315-324
(2000), Andre et al, J. Virol. 72:1497-1503 (1998)).
(See Fig. 1D.) The codon optimized CON6 env gene was
constructed and subcloned into pcDNA3.1 DNA at EcoR I
and BamH I sites (Gao et al, AIDS Res. Hum.
Retroviruses, 19:817-823 (2003)). High levels of
protein expression were confirmed with Western-blot
assays after transfection into 293T cells. To obtain
recombinant CON6 Env proteins for characterization
and use as immunogens, rVV was generated to express
secreted gp120 and uncleaved gp140CF (Figure 1B).
Purity for each protein was ≥90% as determined by
Coomassie blue gels under reducing conditions
(Figure 1C).
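
The two-step "consensus of subtype consensuses" described above can be illustrated with a toy example. The sketch below is not the procedure actually used to build CON6: it assumes pre-aligned sequences of equal length, takes a simple per-column majority, ignores gaps and the variable regions that were filled in from 98CN006, and uses invented three-residue sequences. It shows only why the two-step approach keeps a heavily sampled subtype from dominating the result.

```python
from collections import Counter

def consensus(aligned_seqs):
    """Most common residue at each column of equal-length, pre-aligned sequences."""
    return "".join(Counter(column).most_common(1)[0][0]
                   for column in zip(*aligned_seqs))

def group_consensus(seqs_by_subtype):
    """Consensus of the per-subtype consensus sequences."""
    subtype_consensuses = [consensus(seqs) for seqs in seqs_by_subtype.values()]
    return consensus(subtype_consensuses)

# Toy alignment (invented); subtype "B" is deliberately over-sampled.
toy = {
    "A": ["MKV"],
    "B": ["MRV", "MRV", "MRV", "MRV", "MRA"],
    "C": ["MKV", "MKV"],
}
pooled = [s for seqs in toy.values() for s in seqs]
two_step = group_consensus(toy)   # each subtype contributes one vote per column
one_step = consensus(pooled)      # the over-sampled subtype dominates
print(two_step, one_step)
```

In this toy case the pooled consensus follows subtype "B" at the second position, whereas the two-step consensus reflects the residue shared by two of the three subtypes.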

CD4 Binding Domain and Other Wild-type HIV-1
Epitopes are Preserved on CON6 Proteins. To
determine if CON6 proteins can bind to CD4 and
express other wild-type HIV-1 epitopes, the ability
of CON6 gp120 and gp140CF to bind soluble (s) CD4, to
bind several well-characterized anti-gp120 mabs, and
to undergo CD4-induced conformational changes was
assayed. First, BIAcore CM5 sensor chips were
coated with either sCD4 or mabs to monitor their
binding activity to CON6 Env proteins. It was found
that both monomeric CON6 gp120 and oligomeric
gp140CF efficiently bound sCD4 and anti-gp120 mabs
T8, 2G12 and A32, but did not constitutively bind
mab 17b, which recognizes a CD4-inducible epitope in
the CCR5 binding site of gp120 (Figures 2A and 2B).
Both sCD4 and A32 can expose the 17b binding epitope
after binding to wild-type gp120 (Wyatt et al,
Nature 393:705-711 (1998), Wyatt et al, J. Virol.
69:5723-5733 (1995)). To determine if the 17b
epitope could be induced on CON6 Envs by either sCD4
or A32, sCD4, A32 and T8 were coated on sensor
chips, then CON6 gp120 or gp140CF captured, and mab
17b binding activity monitored. After binding sCD4
or mab A32, both CON6 gp120 and gp140CF were
triggered to undergo conformational changes and
bound mab 17b (Figures 2C and 2D). In contrast,
after binding mab T8, the 17b epitope was not
exposed (Figures 2C and 2D). ELISA was next used to
determine the reactivity of a panel of human mabs
against the gp120 V3 loop (447, F39F), the CD4
binding site (b12), and the gp41 neutralizing
determinant (2F5) to CON6 gp120 and gp140CF (Figure
2E). Both CON6 rgp120 and rgp140CF proteins bound
well to neutralizing V3 mabs 447 and F39F and to the
potent neutralizing CD4 binding site mab b12. Mab
2F5, which neutralizes HIV-1 primary isolates by
binding to a C-terminal gp41 epitope, also bound
well to CON6 gp140CF (Figure 2E).

CON6 env Gene is Biologically Functional and
Uses CCR5 as its Coreceptor. To determine whether
the CON6 envelope gene is biologically functional, it
was co-transfected with the env-defective SG3
proviral clone into 293T cells. The pseudotyped
viruses were harvested and JC53-BL cells infected.
Blue cells were detected in JC53-BL cells infected
with the CON6 Env pseudovirions, suggesting that
CON6 Env protein is biologically functional (Figure
3A). However, the infectious titers were 1-2 logs
lower than that of pseudovirions with either YU2 or
NL4-3 wild-type HIV-1 envelopes.

The co-receptor usage for the CON6 env gene was
next determined. When treated with CXCR4 blocking
agent AMD3100, the infectivity of NL4-3
Env-pseudovirions was blocked while the infectivity
of YU2 or CON6 Env-pseudovirions was not inhibited
(Figure 3B). In contrast, when treated with CCR5
blocking agent TAK-779, the infectivity of NL4-3
Env-pseudovirions was not affected, while the
infectivity of YU2 or CON6 Env-pseudovirions was
inhibited. When treated with both blocking agents,
the infectivity of all pseudovirions was inhibited.
Taken together, these data show that the CON6
envelope uses the CCR5 co-receptor for its entry
into target cells.

Reaction of CON6 gp120 With Different Subtype
Sera. To determine if multiple subtype linear
epitopes are preserved on CON6 gp120, a recombinant
Env protein panel (gp120 and gp140) was generated.
Equal amounts of each Env protein (100 ng) were
loaded on SDS-polyacrylamide gels, transferred to
nitrocellulose, and reacted with subtype A through G
patient sera as well as anti-CON6 gp120 guinea pig
sera (1:1,000 dilution) in Western blot assays. For
each HIV-1 subtype, four to six patient sera were
tested. One serum representative for each subtype
is shown in Figure 4.

It was found that whereas all subtype sera
tested showed variable reactivities among Envs in
the panel, all group M subtype patient sera reacted
equally well with CON6 gp120 Env protein,
demonstrating that wild-type HIV-1 Env epitopes
recognized by patient sera were well preserved on
the CON6 Env protein. A test was next made as to
whether CON6 gp120 antiserum raised in guinea pigs
could react to different subtype Env proteins. It
was found that the CON6 serum reacted to its own and
other subtype Env proteins equally well, with the
exception of subtype A Env protein (Figure 4).



Induction of T Cell Responses to CON6, Subtype
B and Subtype C Envelope Overlapping Peptides. To
compare T cell immune responses induced by CON6 Env
immunogens with those induced by subtype specific
immunogens, two additional groups of mice were
immunized with subtype B or subtype C DNAs and with
corresponding rVV expressing subtype B or C envelope
proteins. Mice immunized with subtype B (JRFL) or
subtype C (96ZM651) Env immunogen had primarily
subtype-specific T cell immune responses (Figure 5).
IFN-γ SFCs from mice immunized with JRFL (subtype B)
immunogen were detected after stimulation with
subtype B (MN) peptide pools, but not with either
subtype C (Chn19) or CON6 peptide pools. IFN-γ SFCs
from mice immunized with 96ZM651 (subtype C)
immunogen were detected after stimulation with
both subtype C (Chn19) and CON6 peptide pools, but
not with subtype B (MN) peptide pools. In contrast,
IFN-γ SFCs were identified from mice immunized with
CON6 Env immunogens when stimulated with CON6
peptide pools as well as with subtype B or C peptide
pools (Figure 5). The T cell immune responses
induced by CON6 gp140 appeared more robust than
those induced by CON6 gp120. Taken together, these
data demonstrated that CON6 gp120 and gp140CF
immunogens were capable of inducing T cell responses
that recognized T cell epitopes of wild-type subtype
B and C envelopes.

Induction of Antibodies by Recombinant CON6
gp120 and gp140CF Envelopes that Neutralize HIV-1
Subtype B and C Primary Isolates. To determine if
the CON6 envelope immunogens can induce antibodies
that neutralize HIV-1 primary isolates, guinea pigs
were immunized with either CON6 gp120 or gp140CF
protein. Sera collected after 4 or 5 immunizations
were used for neutralization assays and compared to
the corresponding prebleed sera. Two AT-2
inactivated HIV-1 isolates (ADA and AD8) were tested
in syncytium inhibition assays (Table 5A). Two
subtype B SHIV isolates, eight subtype B primary
isolates, four subtype C, and one each subtype A, D,
and E primary isolates were tested in either the
MT-2 or the luciferase-based assay (Table 5B). In the
syncytium inhibition assay, it was found that
antibodies induced by both CON6 gp120 and gp140CF
proteins strongly inhibited AT-2 inactivated ADA and
AD8-induced syncytia (Table 5A). In the MT-2 assay,
weak neutralization of 1 of 2 SHIV isolates (SHIV
SF162P3) by two gp120 and one gp140CF sera was found
(Table 5B). In the luciferase-based assay, strong
neutralization of 4 of 8 subtype B primary isolates
(BX08, SF162, SS1196, and BAL) by all gp120 and
gp140CF sera was found, and weak neutralization of 2
of 8 subtype B isolates (6101, 0692) by most gp120
and gp140CF sera was found. No neutralization was
detected against HIV-1 PAVO (Table 5B). Next, the
CON6 anti-gp120 and gp140CF sera were tested against
four subtype C HIV-1 isolates, and weak
neutralization of 3 of 4 isolates (DU179, DU368, and
S080) was found, primarily by anti-CON6 gp120 sera.
One gp140CF serum, no. 653, strongly neutralized
DU179 and weakly neutralized S080 (Table 5B).
Finally, anti-CON6 Env sera strongly neutralized a
subtype D isolate (93ZR001), weakly neutralized a
subtype E (CM244) isolate, and did not neutralize a
subtype A (92RW020) isolate.










Table 5A

Ability of HIV-1 Group M Consensus Envelope CON6 Proteins to Induce
Fusion Inhibiting Antibodies

                                      Syncytium Inhibition
                                      antibody titer (1)
Guinea Pig No.         Immunogen      AD8         ADA
646                    gp120          270         270
647                    gp120          90          90
648                    gp120          90          270
649                    gp120          90          90
Geometric Mean Titer                  119         156
650                    gp140          270         270
651                    gp140          90          90
652                    gp140          ≥810        810
653                    gp140          270         90
Geometric Mean Titer                  270         207

(1) Reciprocal serum dilution at which HIV-induced syncytia of Sup T1 cells were
inhibited by >90% compared to pre-immune serum. All prebleed sera were negative
(titer <10).
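
The geometric mean titers in Table 5A can be checked with a short calculation, sketched below. The individual titers are taken from the table; the AD8 entry for guinea pig 652 is garbled in the source and is treated here as 810, which is an assumption. The function name is illustrative.

```python
import math

def geometric_mean_titer(titers):
    """Geometric mean of reciprocal titers, computed as exp(mean(log(titer)))."""
    return math.exp(sum(math.log(t) for t in titers) / len(titers))

# gp140 group, AD8 column (the 810 for animal 652 is an assumed reading).
gp140_ad8 = geometric_mean_titer([270, 90, 810, 270])
# gp120 group, ADA column.
gp120_ada = geometric_mean_titer([270, 90, 270, 90])
print(round(gp140_ad8), round(gp120_ada))
```

Under that assumption, the calculation reproduces the tabulated means of 270 (gp140, AD8) and 156 (gp120, ADA).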
CONCLUSIONS

The production of artificial HIV-1 Group M
consensus env genes (encoding sequences) (CON6 and
Con-S) has been described. These genes encode
functional Env proteins that are capable of
utilizing the CCR5 co-receptor for mediating viral
entry. Importantly, these Group M consensus envelope
genes could induce T and B cell responses that
recognized epitopes of subtype B and C HIV-1 primary
isolates. In addition, Con-S induces antibodies that
strongly neutralize subtype C and A HIV-1 strains
(see Table 3).

The correlates of protection to HIV-1 are not
conclusively known. Considerable data from animal
models and studies in HIV-1-infected patients
suggest the goal of HIV-1 vaccine development should
be the induction of broadly-reactive CD4+ and CD8+
anti-HIV-1 T cell responses (Letvin et al, Annu.
Rev. Immunol. 20:73-99 (2002)) and high levels of
antibodies that neutralize HIV-1 primary isolates of
multiple subtypes (Mascola et al, J. Virol.
73:4009-4018 (1999), Mascola et al, Nat. Med.
6:207-210 (2000)).

The high level of genetic variability of HIV-1
has made it difficult to design immunogens capable
of inducing immune responses of sufficient breadth
to be clinically useful. Epitope based vaccines for
T and B cell responses (McMichael et al, Vaccine
20:1918-1921 (2002), Sbai et al, Curr. Drug Targets
Infect. Disord. 1:303-313 (2001), Haynes, Lancet
348:933-937 (1996)), constrained envelopes
reflective of fusion intermediates (Fouts et al,
Proc. Natl. Acad. Sci. USA 99:11842-11847 (2002)),
as well as exposure of conserved high-order
structures for induction of anti-HIV-1 neutralizing
antibodies have been proposed to overcome HIV-1
variability (Roben et al, J. Virol. 68:4821-4828
(1994), Saphire et al, Science 293:1155-1159
(2001)). However, with the ever-increasing
diversity and rapid evolution of HIV-1, the virus is
a rapidly moving complex target, and the extent of
complexity of HIV-1 variation makes all of these
approaches problematic. The current most common
approach to HIV-1 immunogen design is to choose a
wild-type field HIV-1 isolate that may or may not be
from the region in which the vaccine is to be
tested. Polyvalent envelope immunogens have been
designed incorporating multiple envelope immunogens
(Bartlett et al, AIDS 12:1291-1300 (1998), Cho et
al, J. Virol. 75:2224-2234 (2001)).

The above-described study tests a new strategy
for HIV-1 immunogen design by generating a group M
consensus env gene (CON6) with decreased genetic
distance between this candidate immunogen and wild-
type field virus strains. The CON6 env gene was
generated for all subtypes by choosing the most
common amino acids at most positions (Gaschen et al,
Science 296:2354-2360 (2002), Korber et al, Science
288:1789-1796 (2000)). Since only the most common
amino acids were used, the majority of antibody and
T cell epitopes were well preserved. Importantly,
the genetic distance between the group M consensus
env sequence and any subtype env sequence was about
15%, which is only half of that between wild-type
subtypes (30%) (Gaschen et al, Science 296:2354-2360
(2002)). This distance is approximately the same as
that among viruses within the same subtype.
Further, the group M consensus env gene was also
about 15% divergent from any recombinant viral env
gene, since CRFs do not increase the
overall genetic divergence among subtypes.
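The percent divergence figures discussed above can be illustrated with a simple pairwise comparison of aligned sequences. The sketch below assumes an uncorrected proportion of differing sites (p-distance); the text does not state which distance measure was used, and the function name and toy sequences are hypothetical.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Fraction of differing positions between two aligned sequences.

    Columns in which either sequence carries a gap are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return differing / compared


# Toy aligned fragments (hypothetical, not real Env sequences).
consensus_fragment = "MRVKGIQRNC"
isolate_fragment = "MRVRGIQ-NY"
print(f"{100 * p_distance(consensus_fragment, isolate_fragment):.1f}% divergent")
```

Applied to full-length aligned env sequences, a value near 0.15 would correspond to the approximately 15% distance described for the group M consensus.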

Infectivity of CON6-Env pseudovirions was
confirmed using a single-round infection system,
although the infectivity was compromised, indicating
that the artificial envelope was not in an "optimal"
functional conformation, but was nevertheless able to
mediate virus entry. That the CON6 envelope used CCR5
(R5) as its coreceptor is important, since the majority
of HIV-1 infected patients are initially infected with
R5 viruses.

BIAcore analysis showed that both CON6 gp120
and gp140CF bound sCD4 and a number of mabs that
bind to wild-type HIV-1 Env proteins. The
expression of the CON6 gp120 and gp140CF proteins that
are similar antigenically to wild-type HIV-1
envelopes is an important step in HIV-1 immunogen
development. However, many wild-type envelope
proteins express the epitopes to which potent
neutralizing human mabs bind, yet when used as
immunogens themselves, do not induce broadly
neutralizing anti-HIV-1 antibodies of the
specificity of the neutralizing human mabs.

The neutralizing antibody studies were
encouraging in that CON6 gp120, CON6 gp140CF
and Con-S gp140CFI all induced antibodies that
neutralized select subtype B, C and D HIV-1 primary
isolates, with Con-S gp140CFI inducing the most
robust neutralization of non-subtype B primary HIV
isolates. However, it is clear that the most
difficult-to-neutralize primary isolates (PAVO,
6101, BG1168, 92RW020, CM244) were either only
weakly or not neutralized by anti-CON6 gp120 or
gp140 sera (Table 4b). Nonetheless, the Con-S
envelope immunogenicity for induction of
neutralizing antibodies is promising, given the
breadth of responses generated with the Con-S
subunit gp140CFI envelope protein for non-subtype B
HIV isolates. Previous studies with poxvirus
constructs expressing gp120 and gp160 have not
generated high levels of neutralizing antibodies
(Evans et al, J. Infect. Dis. 180:290-298 (1999),
Polacino et al, J. Virol. 73:618-630 (1999),
Ourmanov et al, J. Virol. 74:2960-2965 (2000), Pal
et al, J. Virol. 76:292-302 (2002), Excler and
Plotkin, AIDS 11(Suppl A):S127-137 (1997)). rVV
expressing secreted CON6 gp120 and gp140 have been
constructed and antibodies that neutralize HIV-1
primary isolates induced. An HIV neutralizing
antibody immunogen can be a combination of Con-S
gp140CFI, or subunit thereof, with immunogens that
induce antibodies that neutralize most subtype B
isolates.

The structure of an oligomeric gp140 protein is
critical when evaluating protein immunogenicity. In
this regard, study of purified CON6 gp140CF proteins
by fast performance liquid chromatography (FPLC) and
analytical ultracentrifugation has demonstrated
that the purified gp140 peak consists predominantly
of trimers with a small component of dimers.

Thus, centralized envelopes such as CON6, Con-S
or 2003 group M or subtype consensus or ancestral
encoding sequences described herein, are attractive
candidates for preparation of various potentially
"enhanced" envelope immunogens including CD4-Env
complexes, constrained envelope structures, and
trimeric oligomeric forms. The ability of CON6-
induced T and B cell responses to protect against
HIV-1 infection and/or disease in SHIV challenge
models will be studied in non-human primates.

The above study has demonstrated that
artificial centralized HIV-1 genes such as the group M
consensus env gene (CON6) and Con-S can also induce
T cell responses to T cell epitopes in wild-type
subtype B and C Env proteins as well as to those on
group M consensus Env proteins (Figure 5). While
the DNA prime and rVV boost regimen with CON6
gp140CF immunogen clearly induced IFN-γ producing T
cells that recognized subtype B and C epitopes,
further studies are needed to determine if
centralized sequences such as are found in the CON6
envelope are significantly better at inducing cross-
clade T cell responses than wild-type HIV-1 genes
(Ferrari et al, Proc. Natl. Acad. Sci. USA 94:1396-
1401 (1997), Ferrari et al, AIDS Res. Hum.
Retroviruses 16:1433-1443 (2000)). However, the
fact that CON6 (and Con-S env encoding sequence)
primed and boosted splenocyte T cells recognized HIV-
1 subtype B and C T cell epitopes is an important
step in demonstrating that CON6 (and Con-S) can
induce T cell responses that might be clinically
useful.

Three computer models (consensus, ancestor and
center of the tree (COT)) have been proposed to
generate centralized HIV-1 genes (Gaschen et al,
Science 296:2354-2360 (2002), Gao et al, Science
299:1517-1518 (2003), Nickle et al, Science
299:1515-1517 (2003), Korber et al, Science
288:1789-1796 (2000)). They all tend to locate at
the roots of the star-like phylogenetic trees for
most HIV-1 sequences within or between subtypes. As
experimental vaccines, they all can reduce the
genetic distances between immunogens and field virus
strains. However, consensus, ancestral and COT
sequences each have advantages and disadvantages
(Gaschen et al, Science 296:2354-2360 (2002), Gao et
al, Science 299:1517-1518 (2003), Nickle et al,
Science 299:1515-1517 (2003)). Consensus and COT
represent the sequences or epitopes in sampled
current wild-type viruses and are less affected by
outlier HIV-1 sequences, while ancestor represents
ancestral sequences that can be significantly
affected by outlier sequences. However, at present,
it is not known which centralized sequence can serve
as the best immunogen to elicit broad immune
responses against diverse HIV-1 strains, and studies
are in progress to test these different strategies.

Taken together, the data have shown that the
HIV-1 artificial CON6 and Con-S envelopes can induce
T cell responses to wild-type HIV-1 epitopes, and
can induce antibodies that neutralize HIV-1 primary
isolates, thus demonstrating the feasibility and
promise of using artificial centralized HIV-1
sequences in HIV-1 vaccine design.

EXAMPLE 2 

HIV-1 Subtype C Ancestral and Consensus Envelope
Glycoproteins

EXPERIMENTAL DETAILS

HIV-1 subtype C ancestral and consensus env
genes were obtained from the Los Alamos HIV
Molecular Immunology Database
(http://hiv-web.lanl.gov/immunology), codon-usage
optimized for mammalian cell expression, and
synthesized (Fig. 6).
To ensure optimal expression, a Kozak sequence
(GCCGCCGCC) was inserted immediately upstream of the
initiation codon. In addition to the full-length
genes, two truncated env genes were generated by
introducing stop codons immediately after the gp41
membrane-spanning domain (IVNR) and the gp120/gp41
cleavage site (REKR), generating the gp140 and gp120
forms of the glycoproteins, respectively (Fig. 8).
Genes were tested for integrity in an in vitro
transcription/translation system and expressed in
mammalian cells. To determine if the ancestral and
consensus subtype C envelopes were capable of
mediating fusion and entry, gp160 and gp140 genes
were co-transfected with an HIV-1/SG3Δenv provirus
and the resulting pseudovirions tested for
infectivity using the JC53-BL cell assay (Fig. 7).
Co-receptor usage and envelope neutralization
sensitivity were also determined with slight
modifications of the JC53-BL assay. Codon-usage
optimized and rev-dependent 96ZM651 env genes were
used as contemporary subtype C controls.
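The construct design described above (a Kozak sequence placed upstream of the initiation codon, and stop codons introduced immediately after the IVNR and REKR motifs) can be sketched at the sequence level as follows. This is an illustration only: the function names and the toy protein are hypothetical, and the sketch operates on the encoded protein rather than on the actual synthetic genes.

```python
KOZAK = "GCCGCCGCC"  # Kozak sequence given in the text


def add_kozak(orf_dna):
    """Prepend the Kozak sequence immediately upstream of the ATG."""
    if not orf_dna.upper().startswith("ATG"):
        raise ValueError("open reading frame must begin with ATG")
    return KOZAK + orf_dna


def truncate_after_motif(protein, motif):
    """Return the protein ending immediately after `motif`.

    This mimics a stop codon placed directly after the motif. The motif
    must occur exactly once so that the cut position is unambiguous.
    """
    if protein.count(motif) != 1:
        raise ValueError(f"motif {motif!r} must occur exactly once")
    end = protein.index(motif) + len(motif)
    return protein[:end]


# Toy stand-in for a gp160 sequence (hypothetical, for illustration only).
toy_gp160 = "MAAAREKRAVGIGGGIVNRVRQ"
toy_gp120 = truncate_after_motif(toy_gp160, "REKR")  # stop after cleavage site
toy_gp140 = truncate_after_motif(toy_gp160, "IVNR")  # stop after membrane-spanning domain
print(toy_gp120, toy_gp140, add_kozak("ATGCGCGTG"))
```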

RESULTS 

Codon-optimized subtype C ancestral and
consensus envelope genes (gp160, gp140, gp120)
express high levels of env glycoprotein in mammalian
cells (Fig. 9).

Codon-optimized subtype C gp160 and gp140
glycoproteins are efficiently incorporated into
virus particles. Western blot analysis of sucrose-
purified pseudovirions reveals ten-fold higher
levels of virion incorporation of the codon-
optimized envelopes compared to that of rev-
dependent contemporary envelope controls (Fig. 10A).

Virions pseudotyped with either the subtype C
consensus gp160 or gp140 envelope were more
infectious than pseudovirions containing the
corresponding gp160 and gp140 ancestral envelopes.
Additionally, gp160 envelopes were consistently more
infectious than their respective gp140 counterparts
(Fig. 10B).

Both subtype C ancestral and consensus 
envelopes utilize CCR5 as a co-receptor to mediate 
virus entry (Fig. 11). 

The infectivity of subtype C ancestral and
consensus gp160 containing pseudovirions was
neutralized by plasma from subtype C infected
patients. This suggests that these artificial
envelopes possess a structure that is similar to
that of native HIV-1 env glycoproteins and that
common neutralization epitopes are conserved. No
significant differences in neutralization potential
were noted between subtype C ancestral and consensus
env glycoproteins (gp160) (Fig. 12).

CONCLUSIONS 

HIV-1 subtype C viruses are among the most
prevalent circulating isolates, representing
approximately fifty percent of new infections
worldwide. Genetic diversity among globally
circulating HIV-1 strains poses a challenge for
vaccine design. Although HIV-1 Env protein is highly
variable, it can induce both humoral and cellular
immune responses in the infected host. By analyzing
70 HIV-1 complete subtype C env sequences, consensus
and ancestral subtype C env genes have been
generated. Both sequences are roughly equidistant
from contemporary subtype C strains and thus are
expected to induce better cross-protective immunity.
A reconstructed ancestral or consensus sequence
derived-immunogen minimizes the extent of genetic
differences between the vaccine candidate and
contemporary isolates. However, consensus and
ancestral subtype C env genes differ by 5% in amino
acid sequence. Both consensus and ancestral
sequences have been synthesized for analyses.
Codon-optimized subtype C ancestral and consensus
envelope genes have been constructed and the in
vitro biological properties of the expressed
glycoproteins determined. Synthetic subtype C
consensus and ancestral env genes express
glycoproteins that are similar in their structure,
function and antigenicity to contemporary subtype C
wild-type envelope glycoproteins.



EXAMPLE 3 

Codon-Usage Optimization of Consensus of Subtype C
gag and nef Genes (C.con.gag and C.con.nef)



Subtype C viruses have become the most
prevalent viruses among all subtypes of Group M
viruses in the world. More than 50% of HIV-1
infected people are currently carrying HIV-1 subtype
C viruses. In addition, there is considerable
intra-subtype C variability: different subtype C
viruses can differ by as much as 10%, 6%, 17% and
16% in their Gag, Pol, Env and Nef proteins,
respectively. Most importantly, the subtype C
viruses from one country can vary as much as the
viruses isolated from other parts of the world. The
only exceptions are HIV-1 strains from India/China,
Brazil and Ethiopia/Djibouti where subtype C appears
to have been introduced more recently. Due to the
high genetic variability of subtype C viruses even
within a single country, an immunogen based on a
single virus isolate may not elicit protective
immunity against other isolates circulating in the
same area.

Thus gag and nef gene sequences of subtype C
viruses were gathered to generate consensus
sequences for both genes by using a 50% consensus
threshold. To avoid a potential bias toward founder
viruses, only one sequence was used from each of
India/China, Brazil and Ethiopia/Djibouti
to generate the subtype C consensus
sequences (C.con.gag and C.con.nef). The codons of
both C.con.gag and C.con.nef genes were optimized
based on the codon usage of highly expressed human
genes. The protein expression following transfection
into 293T cells is shown in Figure 13. As can be
seen, both consensus subtype C Gag and Nef proteins
were expressed efficiently and recognized by Gag-
and Nef-specific antibodies. The protein expression
levels of both C.con.gag and C.con.nef genes are
comparable to that of the native subtype C env gene
(96ZM651).
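The generation of a consensus sequence with a 50% threshold can be sketched as a column-by-column majority call over an alignment. The following is a minimal sketch under stated assumptions: the function name is hypothetical, a residue is called only when its frequency strictly exceeds the threshold (the text does not say whether the comparison is strict), and columns without a majority residue are marked "X".

```python
from collections import Counter


def consensus(aligned, threshold=0.5, ambiguous="X"):
    """Column-wise consensus of equal-length aligned sequences."""
    if not aligned:
        raise ValueError("need at least one sequence")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must have equal length")
    called = []
    for column in zip(*aligned):
        residue, count = Counter(column).most_common(1)[0]
        # Call the residue only if it exceeds the frequency threshold.
        if count / len(column) > threshold:
            called.append(residue)
        else:
            called.append(ambiguous)
    return "".join(called)


# Toy alignment (hypothetical fragments, not real Gag or Nef sequences).
alignment = ["MGARA", "MGARS", "MGSRA", "MGAKT"]
print(consensus(alignment))
```

Limiting closely related founder sequences to one representative per region, as described above, would be done when assembling the input alignment, before a function of this kind is applied.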
EXAMPLE 4

Synthesis of a Full Length "Consensus of the
Consensus env Gene with Consensus Variable Regions"
(CON-S)



In the synthesized "consensus of the consensus"
env gene (CON6), the variable regions were replaced
with the corresponding regions from a contemporary
subtype C virus (98CN006). A further con/con gene
has been designed that also has consensus variable
regions (Con-S). The codons of the Con-S env gene
were optimized based on the codon usage of highly
expressed human genes. (See Figs. 14A and 14B for
amino acid sequences and nucleic acid sequences,
respectively.)
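Codon-usage optimization of the kind described above can be illustrated by back-translating a protein sequence with one preferred codon per amino acid. The sketch below is a simplification and not the procedure actually used: the codon table is an illustrative assumption (one codon commonly regarded as frequent in highly expressed human genes for each residue), and the function name is hypothetical.

```python
# Illustrative preferred-codon table (an assumption for this sketch).
PREFERRED_CODON = {
    "A": "GCC", "R": "CGC", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "TCC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
    "*": "TGA",
}


def back_translate(protein):
    """Encode a protein using a single preferred codon per residue."""
    codons = []
    for residue in protein.upper():
        if residue not in PREFERRED_CODON:
            raise ValueError(f"unknown residue: {residue}")
        codons.append(PREFERRED_CODON[residue])
    return "".join(codons)


print(back_translate("MRVKGI*"))
```

A practical optimization scheme may also vary codon choice to avoid repeats or unwanted restriction sites; such refinements are outside the scope of this sketch.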

Paired oligonucleotides (80-mers) which overlap
by 20 bp at their 3' ends and contain invariant
sequences at their 5' and 3' ends, including the
restriction enzyme sites EcoRI and BbsI as well as
BsmBI and BamHI, respectively, were designed. BbsI
and BsmBI are Type II restriction enzymes that
cleave outside of their recognition sequences. They
have been positioned in the oligomers in such a way
that they cleave the first four residues adjacent to
the 18 bp invariant region, leaving 4 base 5'
overhangs at the end of each fragment for the
following ligation step. Twenty-six paired oligomers
were linked individually using PCR and primers
complementary to the 18 bp invariant sequences.
Each pair was cloned into pGEM-T (Promega) using the
T/A cloning method and sequenced to confirm the
absence of inadvertent mutations/deletions. pGEM-T
subclones containing the proper inserts were then
digested, run on a 1% agarose gel, and gel purified
(Qiagen). Four individual 108-mers were ligated
into pcDNA3.1 (Invitrogen) in a multi-fragment
ligation reaction. The four-way ligations occurred
among groups of fragments, in a stepwise manner from
the 5' to the 3' end of the gene. This process was
repeated until the entire gene was reconstructed in
the pcDNA3.1 vector.

A complete Con-S gene was constructed by
ligating the codon usage optimized oligo pairs
together. To confirm its open reading frame, an in
vitro transcription and translation assay was
performed. Protein products were labeled with 35S-
methionine during the translation step, separated on
a 10% SDS-PAGE gel, and detected by radioautography.
The expected size of the expressed Con-S gp160 was
identified in 4 out of 7 clones (Fig. 14C).

Con-S Env protein expression in mammalian
cells was examined after transfection into 293T cells
using a Western blot assay (Figure 15). The expression
level of Con-S Env protein is very similar to what was
observed from the previous CON6 env clone that
contains the consensus conserved regions and
variable loops from the 98CN006 virus isolate.

The Env-pseudovirions were produced by
cotransfecting the Con-S env clone and the env-deficient
SG3 proviral clone into 293T cells. Two days after
transfection, the pseudovirions were harvested and
used to infect JC53BL-13 cells. The infectious units
(IU) were determined by counting the blue cells
after staining with X-gal in three independent
experiments. When compared with the CON6 env clone,
Con-S env clones produce a similar number of IU in
JC53BL-13 cells (Figure 16). The IU titers for both are
about 3 log higher than the SG3 backbone clone
control (No Env). However, the titers are also
about 2 log lower than the positive control (the
native HIV-1 env gene, NL4-3 or YU2). These data
suggest that both consensus group M env clones are
biologically functional. Their functionality,
however, has been compromised. The functional
consensus env genes indicate that these Env proteins
fold correctly, preserve the basic conformation of
the native Env proteins, and are able to be
developed as universal Env immunogens.

It was next determined what coreceptor Con-S
Env uses for its entry into JC53-BL cells. When
treated with the CXCR4 blocking agent AMD3100, the
infectivity of NL4-3 Env-pseudovirions was blocked
while the infectivity of YU2, Con-S or CON6 Env-
pseudovirions was not inhibited. In contrast, when
treated with the CCR5 blocking agent TAK779, the
infectivity of NL4-3 Env-pseudovirions was not
affected, while the infectivity of YU2, Con-S or
CON6 Env-pseudovirions was inhibited. When treated
with both blocking agents, the infectivity of all
pseudovirions was inhibited. Taken together, these
data show that the Con-S as well as CON6 envelope
uses the CCR5 but not CXCR4 co-receptor for its
entry into target cells.
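The reasoning used above to assign coreceptor usage from the two blocking agents can be written as a small decision function. This is an illustrative sketch only; the function name and the labels returned for cases not observed in the experiment are assumptions.

```python
def coreceptor_usage(blocked_by_cxcr4_agent, blocked_by_ccr5_agent):
    """Infer coreceptor usage from inhibition by each single agent.

    blocked_by_cxcr4_agent: infectivity inhibited by AMD3100 alone
    blocked_by_ccr5_agent:  infectivity inhibited by TAK779 alone
    """
    if blocked_by_ccr5_agent and not blocked_by_cxcr4_agent:
        return "CCR5"
    if blocked_by_cxcr4_agent and not blocked_by_ccr5_agent:
        return "CXCR4"
    if not blocked_by_cxcr4_agent and not blocked_by_ccr5_agent:
        # Neither single agent blocks entry: the envelope may use both
        # coreceptors (assumed label; not observed in the text).
        return "dual or other"
    return "indeterminate"  # blocked by each agent alone; not observed


# Observations reported in the text.
observations = {
    "NL4-3": (True, False),
    "YU2": (False, True),
    "Con-S": (False, True),
    "CON6": (False, True),
}
for env, (by_amd3100, by_tak779) in observations.items():
    print(env, coreceptor_usage(by_amd3100, by_tak779))
```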

It was next determined whether CON6 or Con-S
Env proteins could be equally efficiently
incorporated into the pseudovirions. To be able to
precisely compare how much Env protein was
incorporated into the pseudovirions, each
pseudovirion preparation was loaded on SDS-PAGE at
the same concentration: 5 µg total protein for cell
lysate, 25 ng p24 for cell culture supernatant, or
150 ng p24 for purified virus stock (concentrated
pseudovirions after super-speed centrifugation).
There was no difference in amounts of Env proteins
incorporated in CON6 or Con-S Env-pseudovirions in
any preparations (cell lysate, cell culture supernatant
or purified virus stock) (Figure 17).

EXAMPLE 5 

Synthesis of a Consensus Subtype A Full Length env
(A.con.env) Gene

Subtype A viruses are the second most prevalent
HIV-1 in the African continent where over 70% of
HIV-1 infections have been documented. Consensus
gag, env and nef genes for subtype C viruses, which
are the most prevalent viruses in Africa and in the
world, were previously generated. Since genetic
distances between subtype A and C viruses are as
high as 30% in the env gene, the cross reactivity or
protection between both subtypes will not be
optimal. Two group M consensus env genes for all
subtypes were also generated. However, to target
any particular subtype viruses, the subtype specific
consensus genes will be more effective since the
genetic distances between subtype consensus genes
and field viruses from the same subtype will be
smaller than that between group M consensus genes
and these same viruses. Therefore, consensus genes
need to be generated for development of subtype A
specific immunogens. The codons of the A.con.env
gene were optimized based on the codon usage of
highly expressed human genes. (See Figs. 18A and
18B for amino acid and nucleic acid sequences,
respectively.)

Each pair of the oligos has been amplified,
cloned, ligated and sequenced. After the open
reading frame of the A.con env gene was confirmed by
an in vitro transcription and translation system,
the A.con env gene was transfected into 293T
cells and the protein expression and specificity
confirmed with the Western blot assay (Figure 18).
It was then determined whether the A.con envelope is
biologically functional. It was co-transfected with
the env-defective SG3 proviral clone into 293T
cells. The pseudotyped viruses were harvested and
used to infect JC53-BL cells. Blue cells were
detected in JC53-BL cells infected with the A.con
Env-pseudovirions, suggesting that the A.con Env
protein is biologically functional (Table 6). However,
the infectious titer of A.con Env-pseudovirions was
about 7-fold lower than that of pseudovirions with
wild-type subtype C envelope (Table 6). Taken
together, the biological function of the A.con Env
protein suggests that it folds correctly and may
present linear and conformational T and B cell
epitopes if used as an Env immunogen.



                 JC53BL13 (IU/µl)
                 3/31/03               4/7/03              4/25/03
                 non filtered supt.    0.22 µm filtered    0.22 µm filtered
A.con +SG3       4                     8.5                 15.3
96ZM651 +SG3     87                    133                 104
SG3 backbone     0                     0.07                0.03
Neg control      0                     0.007               0

Table 6. Infectivity of pseudovirions with A.con env genes
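The fold difference between the wild-type subtype C control and the A.con envelope can be computed directly from the titers in Table 6. The short sketch below uses the tabulated values; the function and variable names are hypothetical.

```python
# Infectious titers (IU/µl) from Table 6, in order of experiment date.
titers = {
    "A.con +SG3": [4, 8.5, 15.3],
    "96ZM651 +SG3": [87, 133, 104],
}


def fold_differences(reference, sample):
    """Ratio of reference titer to sample titer for each experiment."""
    if len(reference) != len(sample):
        raise ValueError("titer lists must have the same length")
    return [ref / smp for ref, smp in zip(reference, sample)]


folds = fold_differences(titers["96ZM651 +SG3"], titers["A.con +SG3"])
for date, fold in zip(["3/31/03", "4/7/03", "4/25/03"], folds):
    print(f"{date}: {fold:.1f}-fold lower")
```

The ratio for the 4/25/03 experiment (104 versus 15.3) is the one that corresponds to the approximately 7-fold difference noted in the text; the ratios for the two earlier experiments are larger.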



EXAMPLE 6 

Design of Full Length "Consensus of the Consensus
gag, pol and nef Genes" (M.con.gag, M.con.pol and
M.con.nef) and a Subtype C Consensus pol Gene
(C.con.pol)

For the group M consensus genes, two different
env genes were constructed, one with virus specific
variable regions (CON6) and one with consensus
variable regions (Con-S). However, analysis of T
cell immune responses in immunized or vaccinated
animals and humans shows that the env gene normally
is not a main target for T cell immune responses,
although it is the only gene that will induce
neutralizing antibody. Instead, HIV-1 Gag, Pol and
Nef proteins are found to be important for inducing
potent T cell immune responses. To generate a
repertoire of immunogens that can induce both
broader humoral and cellular immune responses for
all subtypes, it may be necessary to construct
group M consensus genes other than the env gene alone.
"Consensus of the consensus" gag, pol and nef genes
(M.con.gag, M.con.pol and M.con.nef) have been
designed. To generate a subtype consensus pol gene,
the subtype C consensus pol gene (C.con.pol) was
also designed. The codons of the M.con.gag,
M.con.pol, M.con.nef and C.con.pol genes were
optimized based on the codon usage of highly
expressed human genes. (See Fig. 19 for nucleic
acid and amino acid sequences.)

EXAMPLE 7 

Synthetic Subtype B Consensus gag and env Genes 

EXPERIMENTAL DETAILS 

Subtype B consensus gag and env sequences were
derived from 37 and 137 contemporary HIV-1 strains,
respectively, codon-usage optimized for mammalian
cell expression, and synthesized (Figs. 20A and
20B). To ensure optimal expression, a Kozak
sequence (GCCGCCGCC) was inserted immediately
upstream of the initiation codon. In addition to
the full-length env gene, a truncated env gene was
generated by introducing a stop codon immediately
after the gp41 membrane-spanning domain (IVNR) to
create a gp145 gene. Genes were tested for
integrity in an in vitro transcription/translation
system and expressed in mammalian cells. (Subtype B
consensus Gag and Env sequences are set forth in
Figs. 20C and 20D, respectively.)

To determine if the subtype B consensus
envelopes were capable of mediating fusion and
entry, gp160 and gp145 genes were co-transfected
with an HIV-1/SG3Δenv provirus and the resulting
pseudovirions were tested for infectivity using the
JC53-BL cell assay. JC53-BL cells are a derivative
of HeLa cells that express high levels of CD4 and
the HIV-1 coreceptors CCR5 and CXCR4. They also
contain the reporter cassettes of luciferase and β-
galactosidase that are each expressed from an HIV-1
LTR. Expression of the reporter genes is dependent
on production of HIV-1 Tat. Briefly, cells are
seeded into 24-well plates, incubated at 37°C for 24
hours and treated with DEAE-dextran at 37°C for
30 min. Virus is serially diluted in 1% DMEM, added
to the cells incubating in DEAE-dextran, and allowed
to incubate for 3 hours at 37°C, after which an
additional 500 µL of cell media is added to each
well. Following a final 48-hour incubation at 37°C,
cells are fixed, stained using X-Gal, and overlaid
with PBS for microscopic counting of blue foci.
Counts for mock-infected wells, used to determine
background, are subtracted from counts for the
sample wells. Co-receptor usage and envelope
neutralization sensitivity were also determined with
slight modifications of the JC53-BL assay.
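The background subtraction described above, followed by conversion of a focus count to a titer, can be sketched as follows. Only the subtraction of mock-infected counts is taken from the text; the conversion to infectious units per microliter (count multiplied by the dilution factor and divided by the inoculum volume) is an assumed, conventional calculation, and the function name and example numbers are hypothetical.

```python
def infectious_units_per_ul(sample_foci, mock_foci, dilution_factor, inoculum_ul):
    """Background-corrected titer in infectious units per microliter.

    sample_foci:     blue foci counted in the sample well
    mock_foci:       blue foci counted in the mock-infected well
    dilution_factor: fold dilution of the virus stock in the well
    inoculum_ul:     volume of diluted virus added, in microliters
    """
    if dilution_factor <= 0 or inoculum_ul <= 0:
        raise ValueError("dilution factor and inoculum volume must be positive")
    # Subtract background; a negative result is treated as zero.
    corrected = max(sample_foci - mock_foci, 0)
    return corrected * dilution_factor / inoculum_ul


# Hypothetical well: 52 foci, 2 background foci, 1:100 dilution, 250 µl added.
print(infectious_units_per_ul(52, 2, 100, 250))
```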

To determine whether the subtype B consensus
Gag protein was capable of producing virus-like
particles (VLPs) that incorporated Env
glycoproteins, 293T cells were co-transfected with
subtype B consensus gag and env genes. At 48 hours
post-transfection, cell supernatants containing VLPs
were collected, clarified in a tabletop centrifuge,
filtered through a 0.2 µm filter, and pelleted through
a 20% sucrose cushion. The VLP pellet was
resuspended in PBS and transferred onto a 20-60%
continuous sucrose gradient. Following overnight
centrifugation at 100,000 x g, 0.5 ml fractions were
collected and assayed for p24 content. The
refractive index of each fraction was also measured.
Fractions with the correct density for VLPs and
containing the highest levels of p24 were pooled and
pelleted a final time. VLP-containing pellets were
resuspended in PBS and loaded on a 4-20% SDS-PAGE
gel. Proteins were transferred to a PVDF membrane
and probed with serum from a subtype B HIV-1
infected individual.

RESULTS 

Codon-usage optimized, subtype B consensus
envelope (gp160, gp145) and gag genes express high
levels of glycoprotein in mammalian cells (Fig. 21).

Subtype B gp160 and gp145 glycoproteins are
efficiently incorporated into virus particles.
Western blot analysis of sucrose-purified
pseudovirions suggests at least five-fold higher
levels of consensus B envelope incorporation
compared to incorporation of a rev-dependent
contemporary envelope (Fig. 23A). Virions
pseudotyped with either the subtype B consensus
gp160 or gp145 envelope are more infectious than
pseudovirions containing a rev-dependent
contemporary envelope (Fig. 23B).

Subtype B consensus envelopes utilize CCR5 as
the co-receptor to gain entry into CD4 bearing
target cells (Fig. 22).

The infectivity of pseudovirions containing the
subtype B consensus gp160 envelope was neutralized
by plasma from HIV-1 subtype B infected patients
(Fig. 24C) and neutralizing monoclonal antibodies
(Fig. 24A). This suggests that the synthetic
subtype B consensus envelope is similar to native
HIV-1 Env glycoproteins in its overall structure and
that common neutralization epitopes remain intact.
Figs. 24B and 24D show neutralization profiles of a
subtype B control envelope (NL4.3 Env).

Subtype B consensus Gag proteins are able to
bud from the cell membrane and form virus-like
particles (Fig. 25A). Co-transfection of the codon-
optimized subtype B consensus gag and gp160 genes
produces VLPs with incorporated envelope (Fig. 25B).

CONCLUSIONS

The synthetic subtype B consensus env and gag
genes express viral proteins that are similar in
their structure, function and antigenicity to
contemporary subtype B Env and Gag proteins. It is
contemplated that immunogens based on subtype B
consensus genes will elicit CTL and neutralizing
immune responses that are protective against a broad
set of HIV-1 isolates.

* * * 

All documents and other information sources
cited above are hereby incorporated in their
entirety by reference.

ABSTRACT

The present invention relates, in general, to
an immunogen and, in particular, to an immunogen for
inducing antibodies that neutralize a wide spectrum
of HIV primary isolates and/or to an immunogen that
induces a T cell immune response. The invention
also relates to a method of inducing anti-HIV
antibodies, and/or to a method of inducing a T cell
immune response, using such an immunogen. The
invention further relates to nucleic acid sequences
encoding the present immunogens.
[Figure: Env amino acid sequence with variable regions marked (labels V1, V4 and V5 legible); panel labels gp120 and gp140CF; molecular weight (MW) markers 198 kd, 115 kd and 93 kd.]

CON6.env (group M env consensus). This one contains five variable regions in the env gene from the 98CN006 virus, not in the

CTGGGCATGCTGATGATCTGCTCCGCCGCCGAGAACCTGTGGGTGACCGTGTACTACGGC 

GTGCCCGTGTGGAAGGAGGCCAACACCACCCTGTTCTGCGCCTCCGACGCCAAGGCCTAC 

GACACCGAGGTGC^CAACGTGTGGGCCACCCACGCCTGCGTGCCC^CCGACCCCAACCCC 

CAGGAGATCGTGCTGGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTG 

GAGCAGATGCACGAGGACATCATCTC CCTGTGGG ACGAGTCCCTGAAGCC CTG CGTGAAG 

CTGACCC CCCTGTGCGTGACCCTGAACTGCAC CAACGTG CGCAACGTGTCCTC CAAC GGC 

ACC GAGACCG ACAACGAGGAGATCAAGAACTG ^^^^^^^^^^Sj^SSPf^^^^^ 

GACAAGAAGCAGAAGGTGTACGCCCTGTTCTACCGCCTGGACGTGGTC 

AAGAACTCCTCCGAGATCTCCGGCAAGAACTC CTC CGAGTACTACCG CCTGATGAACTGC 

AACACCTCCGCC^TC^CCCAGGCCTGCCCCWVGGTGTCCTTCGAGCCCATCCCCATCCAC 

TACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAACGACAAGAAGTTCAACGGCAGC 

GGCCCCTGCAAGAACGTGTCCACCGTGCAGTGCACCCACGGCATCAAGCCCGTGGTGTCC 

ACCCAGCTGCTGCTGAACGGCTCCCTGGCCGAGGAGGAGATCATCATCCGCTCCGAGAAC 

ATCACCAACAACGCCAAGACCATCATCGTGCAGCTGAACGAGTCCGTGGAGATCAACTGC 

ACCCGCCCCAACAACAACACCCGCAAGTCCATCCACATCGGCCCCGGCCAGGCCTTCTAC 

GCCACCGGCGAGATCATCGGC GACATCCGC CAGGCCGACTGCAACATCTCCCGCAC C AAG 

TG GAACAAGAC C CTGCAG C AG GTGG CCAAGAAGCTGC GC GAGGACTTC AACAACAAGACC 

ATCATCTTCAAG CCCTCCTC CGG C G G CGACCTG GAGATCAC CACCCACTC CTTCAACTGC 

GG CG GCGAGTTCTTCTACTGCAACAC CTC CGG C CTGTTCAACTCCAC CTGGATGTTCAAC 

GGCACCTACATGTTCAACGGCACCAAGGACAACTCCGAGACCATCACCCTGCCCTGCCGC 

ATC AAGCAGATCATCAAC ATGTGGCAGGG CGTGGGCCAG GCCATGTACGCCC CCCCCATC 

GAGGGCAAGATCACCTGCAAGTCCAACATCACCGGCCTGCTGCTGACCCGCGACGGCGGC 

AACAACTCCAACAAGAACAAGACCGAGACCTTCCGCCCCGGCGGCGGCGACATGCGCGAC 

AACTGGCGCTCCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCC 

CCCACCAAGGCCAAGCGCCGCGTGGTGGAGCGCGAGAAGCGCGCCGTGGGCATCGGCGCC 

GTGTTCCTGGGCTTCCTGGGC GCCGCCGGCTCCAC CATGGGC GCCGCCTCCATCACCCTG 

ACCGTGCAGGCCCGCCAGCTGCTGTCCGGCATCGTGCAGCAGCAGTCCAACCTGCTGCGC 

G CGATCGAGGCCCAGCAGCACCTGCTGCAGCTGAC CGTGTGGGGCATCAAG CAGCTGCAG 

G CCCGCGTG CTGGCCGTGGAGC GCTACCTGAAGGACCAG CAGCTG CTGGGCATCTG GGG C 

TGCTCC GGCAAGCTGATCTGCAC CACCAACGTGCCCTGGAACTC CTCCTGGTC C AAGAAG 

TC C C AG GACGAGATCTG GGACAACATGAC CTG GATG GAGTGG GAGC GC G AGATCTC CAAC 

TACACCGACATCATCTACCGCCTGATCGAGGAGTCCCAGAACCAGCAGGAGAAGAACGAG 

CAGGAGCTGCTGGCCCTGGACAAGTGGGCCTCCCTGTGGAACTGGTTCGACATCACCAAC 

TGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGATCGGCCTGCGCATC 

GTGTTCGCCGTGCTGTC CATC GTGAACCGCGTGCGCCAGGGCTACTCCC CCCTGTCCTTC 

CAGACCCTGATCCCCAACCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGC 

GGCGAGCAGGGCCGCGACCGCTCCATCCGCCTGGTGAACGGCTTCCTGGCCCTGGCCTGG 

GAC G ACCTG CGCTCC CTGTGCCTGTTCTCCTACCACCGCCTGCGCGACTTC ATC CTG ATC 

GCCGCCCGCACCGTGGAGCTGCTGGGCCGCCGCTCCCTGCGCGGCCTGCAGAAGGGCTGG 

GAGGCCCTGAAGTAC CTGGGCAACCTGCTGC AGTACTGG GGCCAG GAG CTGAAGAACTCC 

GCCATCTCCCTGCTGGACACCACCGCCATCGCCGTGGCCGAGGGCACCGACCGCGTGATC 

GAGATCGTGCAGCGCGCCTGCCGCGCCATCCTGAACATCCCCCGCCGCATCCGCCAGGGC 

CTG GAGCGCGCC CTGCTGTAA 
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HIV-1 antibody and specificity 
93TH051J 
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990890_G 



CON6 



Subtype B peptides (MN) 
Subtype C peptides (ChM9) 
Consensus peptides (CONS) 
No peptide 




CON6 JRFL 96ZM651 pcD NA3.1 

gp140CF gp120 gp120 



Immunogen 




,.. r. „ nM « ra | -nv T>» amino add sequence is different from Los Alamos Database Auoust 2002) 

gcTgc^atgc*^ 



ACGTGACCAACGCCACCAACAACACCTAC^CGGC^^TG^^CTGC 

TCCTTCAACATG 
CCTGTTCTACCG 
ACCGCCTGATC/ 

£S5£aSSSg^^ 

iil^^cAG^^^ 

rc^H^i^^^^SA^c^^ c 

caacTgc^gcggc^^^ 

CCACCTAC^CAACAACACCAACTCCAAra 



^CAGCAG-GTGGCC^^ 

cctgaag^ccag^gctgctgggc^tctggggctgctcx:gg^gctga 

TCTGCACCACCGCCGTTGCCCTGGAACTCCTCCTGGTCf^ 

caactacaci 
aggagaag* 

ccatcgtgaaccgcgtgcgc^^ 
gcatccgccagggcttcgaggccgccctgctgtaa 



C.con.env (subtype C consensus env. The amino acid sequence is different from Los Alamos Database August 2002)

GCCGC CATGCGCGTGATGGGCATCCTGCGCAACTGCCAG CAGTGGTGGAT 

CTGGGGCATCCTGGGCTTCTGGATGCTGATGATCTGCAACGTGGTGGGCA 

ACCTGTGGGTGACCGTGTACTACGGCGTGCCCGTGTGGAAGGAGGCCAAG 

ACCACCCTGTTCTGCGCCTCCGACGCCAAGGCCTACGAGAAGGAGGTGCA 

CAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGG 

AGATGGTGCTGGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACGAC 

ATGGTGGAC CA GATG CACGAG GACATCATCTC C CTGTGG GAC CAGTCCCT 

GAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTGACCCTGAACTGCCGCA 

ACGTGACCAACGCCACCAACAACACCTACAACGAGGAGATCAAGAACTGC 

TC CTTC AACATCAC CACC GAGCTG CGCGACAAGAAGAAGAAG GTGT AC GC 

CCTGTTCTACCGCCTGGACATCGTGCCCCTGAACGAGAACTCCTCCGAGT 

ACCGCCTGATCAACTGC^CACCTCCGCCATCACCCAGGCCTGCCCCAAG 

GTGTCCTTCGACCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTACGC 

CATCCTGAAGTGCAACAACAAGACCTTCAACGGCACCGG CCCCTGC AACA 

AC GTGTC CACC GTGCAGTG CAC CCAC G G CATGAAG CC CGTGGTGTC CAC C 

CAGCTGCTGCTGAACGGCTCCCTGGCCGAGGAGGAGATCATCATCCGCTC 

CGAGAACCTGACCAACAACGCCAAGACCATCATCGTGCACCTGAACGAGT 

C CGTGGAGATCGTG TGCACC C G CCCCAACAACAAGACCC G CAAGTC CATC 

CGCATCGGCCCCGGCCAGACCTTCTACGCCACCGGCGACATCATCGGCGA 

CATCCGC CAGGCCCACTGCAACATCTCCGAGGACAAGTGGAACAAGACCC 

TG CAG C GCGTGTC CAAGAAGC TGAAG GAG CACTTC C CCAACAAGACC ATC 

AAGTTC GAGCCCTCCTC CG G C GGC GACCTGGAGATCACCACCCACTC CTT 

CAACTGCCGCGGCGAGTTCTTCTACTGCAACACCTCCAAGCTGTTCAACT 

CCACCTACAAO*ACAA(^C(WVCrrCCAACTC 

C G CATCAAG CAGATCATCAACATGTGGCAGGAG GTGGGCCG CG CCATGTA 

CGCCCCCC(XATCGCCGGCAACATC^CCTGa»uAGTCC^C^TC^CCGGCC 

TGCTGCTGACCC GCGACGGC GGCAAGAAGAACACCACCGAGATCTTC C GC 

CC CGGCGGCGGCGACATGCGCGACAACTGGCGCTCC GAGCTGTACAAGTA 

CAAGGTGGTGGAGATCAAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGC 

GCCGCGTGGTGGAGCGCGAGAAGCGCGCCGTGGGCATCGGCGCCGTGTTC 

CTGGGCTTCCTGGGCGCCGCCGGCTCCACCATGGGCGCCGCCTCCATCAC 

CCTGACC GTGCAGGCC C G CCAGCTGCTGTCCGGCATCGTGCAGCAGCAGT 

CCAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACATGCTGCAGCTGACC 

GTGTGGGGCATCAAGCAGCTGCAGACCCGCGTGCTGGCCATCGAGCGCTA 

CCTGAAGGAC CAGCAG CTGCTGGGCATCTGG GGCTGCTCCGGCAAGCTGA 

TCTGCACCACCGCCGTGCCCTGGAACTCCTCCTGGTCCAACAAGTCCCAG 

GAGGAC ATCTG G GACAACATGAC CTG GATG C AGTG GGAC CGC GAGATCTC 

CAACTACACCGACACCATCTACCGCCTGCTGGAGGACTCCCAGAACCAGC 

AGGAGAAGAAC GAGAAGGAC CTGCTG G C CCT GGACTCCTGGAAGAAC CTG 

TGGAACTGGTTC GACATCAC CAACTGG CTGTGGTACATCAAGATCTTCAT 

CATGATCGTGGGCGGCCTGATCGGCCTGCGCATCATCTTCGCCGTGCTGT 

CCATCGTGAACCGCGTGCGCCAGGGCTACTCCCCCCTGTCCTTCCAGACC 

CTGACCCCCAACCCCCGCGGCCCCGACCGCCTGGGCCGCATCGAGGAGGA 

GGGCGGCGAGCAGGACCGCGACCGCTCCATCCGCCTGGTGTCCGGCTTCC 

TGGCCCTGGCCTGGGACGACCTGCGCTCCCTGTGCCTGTTCTCCTACCAC 

CGCCTGCGCGACTTCATCCTGGTGGCCGCCCGCGCCGTGGAGCTGCTGGG 

C CGCTC CTCCCTGCGCGGCCTGCAGCGC GG CTGGGAGGCCCTGAAGTACC 

TGGGCTCCCTGGTGCAGTACTGGGGCCTGGAGCTGAAGAAGTCCGCCATC 

TCCCTGCTGGACACCATCGCCATCGCCGTGGCCGAGGGCACCGACCGCAT 

CATCGAGCTGATCCAGCGCATCTGCCGCGCCATCCGCAACATCCCCCGCC 

GCATCCGCCAGGGCTTCGAGGCCGCCCTGCAGTAA 



EPSSGGDLErTTHSFNCRGEFFYC^^ 
LL 



K <o3> 



C.con.< 



LQ 



Synthesize entire gene in 80-mer fragments overlapping by 20 residues at the 3' end with invariant
sequences at the 5' end.
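The tiling described above can be sketched in a few lines. This is a minimal illustration only: the function name `tile_fragments` and the placeholder sequence are hypothetical, the 18 nt invariant sequences are not modeled, and all fragments are written on one strand for simplicity.

```python
def tile_fragments(gene: str, frag_len: int = 80, overlap: int = 20) -> list[str]:
    """Split a gene into frag_len-nt fragments; neighbours share `overlap` nt."""
    if not 0 <= overlap < frag_len:
        raise ValueError("overlap must be smaller than the fragment length")
    step = frag_len - overlap          # distance between fragment start positions
    fragments = []
    start = 0
    while True:
        fragments.append(gene[start:start + frag_len])
        if start + frag_len >= len(gene):
            break                      # the last fragment reaches the 3' end
        start += step
    return fragments


# Hypothetical 200-nt placeholder, not a sequence from this document.
demo_gene = "ACGT" * 50
frags = tile_fragments(demo_gene)
print(len(frags), [len(f) for f in frags])
```

With these defaults each fragment starts 60 nt after the previous one, so the last 20 nt of one fragment are the first 20 nt of the next.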



EcoRI BbsI



80mer fragment 



BsmBI BamHI



80mer fragment 



Invariant Seq. 
(18 nt) 



Invariant Seq. 



Paired 80mer oligos are connected via PCR in a stepwise manner from 5' to 3' using
primers complementary to the invariant seq.
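The product length of joining one pair of 80mers through a 20 nt overlap can be checked with a short sketch. The helper `join_overlapping` and the toy oligos are hypothetical; both oligos are written on the same strand, and the primers and invariant sequences are left out, so this only illustrates the arithmetic 80 + 80 - 20 = 140.

```python
def join_overlapping(first: str, second: str, overlap: int = 20) -> str:
    """Merge two fragments whose ends share `overlap` identical nucleotides."""
    if first[-overlap:] != second[:overlap]:
        raise ValueError("fragments do not share the expected overlap")
    return first + second[overlap:]


# Toy 80-nt oligos (placeholders, not sequences from this document).
shared = "ACGTACGTACGTACGTACGT"      # 20-nt overlap
left = "A" * 60 + shared             # 80 nt
right = shared + "C" * 60            # 80 nt

product = join_overlapping(left, right)
print(len(product))
```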



5' primer 



80mer Fragment 



140bp PCR product 



3' primer 



108bp PCR fragments cloned into pGEM-T and sequenced. Clones with the proper sequence will be
cut with 2 restriction enzymes. 4 fragments will be ligated together with pcDNA3.1 in a stepwise
manner from the 5' to 3' end of gene.



Fragments to be ligated with pcDNA3.1     Restriction Enzymes Used
(1-4 are in order from 5' to 3')          to Cleave Fragment

Fragment 1                                EcoRI/BsmBI
Fragment 2                                BbsI/BsmBI
Fragment 3                                BbsI/BsmBI
Fragment 4                                BbsI/BamHI
pcDNA3.1                                  EcoRI/BamHI



Fragment 2 



Fragment 3 




Fragment 4 
BamHI 



i 

Ligations will be repeated stepwise 5' to 3' until the entire gene has been cloned
into pcDNA3.1.
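The enzyme assignments for the four fragments and the vector can be written down as data and checked for a consistent 5' to 3' chain. This is a sketch under stated assumptions: the names `LIGATION_PLAN`, `VECTOR` and `check_plan` are hypothetical, and the check only compares enzyme names. Whether the overhangs actually match depends on the designed sequences, which are not modeled here.

```python
# (name, enzyme cutting the 5' end, enzyme cutting the 3' end)
LIGATION_PLAN = [
    ("Fragment 1", "EcoRI", "BsmBI"),
    ("Fragment 2", "BbsI", "BsmBI"),
    ("Fragment 3", "BbsI", "BsmBI"),
    ("Fragment 4", "BbsI", "BamHI"),
]
VECTOR = ("pcDNA3.1", "EcoRI", "BamHI")


def check_plan(plan, vector):
    """Return a list of problems; an empty list means the chain is consistent."""
    problems = []
    if plan[0][1] != vector[1]:
        problems.append("5' end of first fragment does not match the vector")
    if plan[-1][2] != vector[2]:
        problems.append("3' end of last fragment does not match the vector")
    # Assumption: every internal junction pairs a BsmBI-cut 3' end
    # with a BbsI-cut 5' end of the next fragment.
    for (name_a, _, three_a), (name_b, five_b, _) in zip(plan, plan[1:]):
        if (three_a, five_b) != ("BsmBI", "BbsI"):
            problems.append(f"unexpected junction between {name_a} and {name_b}")
    return problems


print(check_plan(LIGATION_PLAN, VECTOR))
```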
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Pseudotyped virions 

infect (± plasma or co-receptor inhibitor) 
Ccon gp160

Plasma Dilution
Plasma from HIV-1 subtype C infected patients Plasma from uninfected donors



Gag
p24

Nef
p33

Cell lysate Supernatant

Cell lysate Supernatant




!g!SghSSkd™^fl5^shkgrpgnflqsrpeptappaesfrfeettpa 

PKQEPKDREPLTSLKSLFGSDPLSQ 



TGCGCCCCGGCGGCAAGAAGCGCTACATG^^^ 
AGATCGAGGTGCGCGACACCAAGGAGGCCCTGG^ 

AGCGACC CCCTGAGC CAGTAA 



TGA 




13^ 



CONs env (group M consensus env gene. This one contains the consensus sequence for variable regions in env gene)

MRVTIGIQRNCQHLWRWGTLILGMLMICSAAE 

WATHACVPTDPNPQEIVLENVTENFNMWKNNMVEQM 

TTNNTEEKGEIKNCSFNITTEIRDKKQKVYALFYRL^^ 

EPIPIHYCAPAGFAILKCNDKKFNGTGPCKNVSTVQCTHGIKPWSTQUa-NGSLAEEElHRSENITNN 
AKTIIVOLJ^IESVE!NCTRPNN^r^RXSlRIGPGQAFYAT^ 
HFNNICniFKPSSGGDLErTTHSFNCRGEFFYChrTSGL^NSTWGN 
WOCVGOAIOTAPPlEGKITCKSNITGUJ/rRDGGNNN^ 

VAPTKAKRRVv^REKRAVGIGAvTLGFLGAAGSTMGAASnXTVQARQLLSGIVQQQSNLLRAIEAQQHL 
LQ LTNAWGl KQLQARVLAVERYUCDQQU.G1WG C SGKLI CTTTVP WNSSWSNKSQD EIWDN MTWMEWE REI 
NNYTDUYSUEESQNQQEKNEQELLALDKWASLWNWFDn^^ 

NRVRQGYSPLSFQTulPNPRGPDRPEGIEEEGGEQDRDRSIRLVTslGFUALAVVDOLRSLCLFSYHRLRDFl 
L1AARTVELLG R KG LRRGWEALKYL WN LLQ YWGQ ELKN SAIS LLDTTA1AVAEGTDR VI EWQ RAC RA I L 
NIPRRIRQGLERALL 



CONs env (group M consensus env gene. This one contains the consensus sequence for variable regions in env gene)

^e e n5c» 




GAGCGCGCCCTGCTGTAA 



gp160
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Cell lysate Supernatant 

Expression of CONs env gene in mammalian cells
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CON» 
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AMD3100 



TAK779 



Infectivity and coreceptor usage of CON6 and CONs env genes



A 




C 
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Lysate Supernatant 5%dVeV 

Env protein incorporation in CON6 and CONs Env-pseudovirions



WATHACVPTDPNPQBNLENVTEEFNMWKNNMVEQMH^^ 
NITNrTDNMKGEIKNCSFNMTTEU^DKKQKV/YSLFYKLDWQINKSNSSSQ 

FEP I P I HYCAP AG FAJ LKC KDKEF N GTG P C KNVSTVQ CTH G I KP WSTQ LLLN G S LAEEEVM I RS EN ITN 
K^NNKTIIFTNSSGGDLEITTHSFNCGGEf^ 

I NM WQRVGQAM YAPPI Q GVI RC ES N ITGLLLTROGGD NNS KN ^^^^^f^? 5?S?#^SS?mV"X^TI^C^I!^ 
LGVAPTKAKRRNA/EREKRAVGI GAVFLGFUGAAGSTMGAASrTLTVQARQlXS G IVQQQ SNLLRAIEAO Q 

HU.KLTVWGIKQLQARVIJVVERYLKDQQU-GIWGCSGKU 

EISNYTDIIYNUEESQNQQEKNEQDLLALDKWANLVWWFDISNWLWYJJOR 

VINRVRQGYSPLSFQTHTPNPGGUDRPGmEEEGGEQGRDRSlFUJ/SGFl^^^ 

RUAARTVEUXSHSSLKGLRLGWEGLKYLWN^^ 

CRAILNIPRR1RQGLERALL 



A.con.env (subtype A consensus env. Identical amino acid sequence to that in the public domain)

CC^GCCCTGCGTGAAGCTGACCCCCCTGTGCG^ 

CC/*ACGTGAACGTGACCACCAACATCACCAACATCACCGACAACATGA^G 

GGOGAGATCAAGAACTGCTCCTTCAACATGACCACCGAGCTGCGCGACAA 

GAAGCAGAAGGTGTACTCCCTGTTCTACAAGCTGG^CGTGGTGCAGATCA 

ACAAGTCCAACTCCTCCTCCCAGTACCGCCTGATCAACTGCAAOJCGTCC 

GCGATCACCCAGGCCTGCCCCAAGGTGTGCTTCGAGCCCATCCCCATGCA 

CTACTGCGCCCCCGCCGGCTTCGCCATCCTGAAGTGCAAGGA^GGAGT 

JCAACGGCACCGGCCCCTGCAAGAACGTGTCCACCGTGCAGTGCACCCAC 

GG^TCAAGCCCGTGGTGTCCACCCAGCTGCTGCTG^CGGCTCCCTGGC 

CGAGGAGCVkGGTGATGATCCGCTCCGAGAACATCACCAACAACGCCAAGA 

AGATX^O^GTGCAGCTGACCAAGCCCGTGAAGATCAACTGCVVCCCGOCCC 

AACAACAACACCCGCAAGTCCATCCGCATCGGCCCCGGCCAGGCCTTCTA 

CGCCM^GGCGACATCATC 

CCCGCACCGAGTGGAACGAGACCCTGCAGAAGGTGGCCAAGCAGCTGCGC 
AAGTM^TTGAACAACAAGACCATCATCTTCACCAACTCCTCCGGCGGCGA 

^cStccagggcgtgatccgctgcgagtccaacatcaccg<^tgctg 
ctgacccgcgacggcggcgacaacaactccaagaacgagaccttccgccc 

CGGCGGCGGC GACATGCGCGACAACTGGCGCTC CGAG CTCTACAAGTACA 
AGGT^TGAAGATCGAGCCCCTGGGCGTGGCCCCC^CCAAGGCCAA^GC 

CG C GTG GTG^AGCGC^AGAAG CGC GCCGTG GG CATC GG C ^T^TTSSJ 
GGGCTTCCTGGGCGCCGCCGGCTCCACC^TGGGCGCCGCCTCC^T^CC 
TGACCGT G GAGGC C CGCCAGCTGCTGTCCGG GATCGTGGAGCAG CAGTCC 
AACCTGCTGCGCGCCATCGAGGCCGAGCAGCACCrTGCTGAAGC^GAC 
GTGGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTCGAGCG^^ 
TGAAGGAC CAGCAG CTGCTGGGCATCTGGGGCTGCTCCGG CAAOCTOATC 
TGCACCACCAACGTGCCCTGGAACTCCTCCTGGTCCAAGAAGTC CCAGTC 
CGAGATCTGGGACAACATGACCTGGCTGCAGTGGGACAAGGAGATCTCCA 
act ACAC cgac ATCATCTACAAC CTGATCGAGGAGTCCCAGAAC CAGCAG 
GAGAAGAACGAGCAGGACCTGCTGGCCCTGGACAAGTGGGCCAACCTGTG 
GAACTGGTTC GACATCTC GAACTGG CTGTGGTACATCAAGATCTTCATCA 
TGATCGTGGGCGGCCTGATCGG CCTGCG CATCGTGTTC GCCGTGCTGTCC 
GTGATCAACCGCGTGCGCCAGGGCTACTCCCCCCTGTCCTTCCAGACCCA 
SccS^CCCCGGCGGCCTGGACCGCCCCGGCCGCATCGAG^GAGG 

GC GG CGAGCAGGG C C GCGAC C GCTCC ATCC GC CTG ^^^^^o? a^?ZT S ^ JP 
GCCCTGGCCTGGGACGACCTGCGCTCCCTGTGCCTGTTCTCCTAC^CCG 

CCTGCGCGACTTCATC CTGATCGC CGC CC GCACCGTGGAG ctgctgg gcc 

ACTCCTCCCTGAAGGGCCTGCGCCTGGGCTGGGAGGGCCTGAACTACCTG 

TGGAACCTGCTGCTGTACTGGGGCCGCGAGCTGAAGATCTCCGC^TC^ 

CCTGCTGGACACCATCGCCATCGCCGTGGCCGGCTGC^^ 

TCGAGATCGGCCAGCGCATCTGCCGCGCCATCCTGAACATCCCCCGCCGC 

ATCC GCC AGGGCCTGGAGCGCGCCCTGCTGTAA 



M.con.gag (group M consensus gag. Identical amino acid sequence to that in the public domain)

GCCGCCGCCATGGGCGCCCGCGCCTCCGTGCTGTCCGGCGGCAAGCTGGA

CGCCTGGGAGAAGATCCGCCTGCGCCCCGGCGGCAAGAAGAAGTACCGCC 

TGAAGCACCTGGTGTGGGCCTCCCGCGAGCTGGAGCGCTTCGCCCTGAAC 

CCCGGCCTGCTGGAGACCTCCGAGGGCTGCAAGCAGATCATCGGCCAGCT 

GCAGCCCGCCCTGCAGACCGGCTCCGAGGAGCTGCGCTCCCTGTACAACA 

CCGTGGCCACCCTGTACTGCGTGCACCAGCGCATCGAGGTGAAGGACACC 

AAG GAGGC C CTGGAGAAGATC GAGGAG G AGCAGAACAAGTC CCAGCAGAA 

GACCCAGCAGGCCGCCGCCGACAAGGGCAACTCCTCCAAGGTGTCCCAGA 

ACTAC C CC ATCGTG CAGAAC CTGCAGG GCCAGATGGTGCACCAG G CCATC 

TCCCCCCGCACCCTGAACGCCTGGGTGAAGGTGATCGAGGAGAAGGCCTT 

CTCCCCCGAGGTGATCCCCATGTTCTCCGCCCTGTCCGAGGGCGCCACCC 

CCCAGGACCTGAACACCATGCTGAACACCGTGGGCGGCCACCAGGCCGCC 

ATGCAGATGCTGAAGGACACCATCAACGAGGAGGCCGCCGAGTGGGACCG 

CCTGCACCCCGTGCACGCCGGCCCCATCCCCCCCGGCCAGATGCGCGAGC 

CC CG C GG CTCC GACATCGC CGGCACCAC CTC CAC CCTGCAGGAG CAGATC 

GC CTG GATGACCTC CAAC C C C C CCATC CC CGTGG GCGAGATCTACAAGC G 

CTGGATCATCCTGGGCCTGAACAAGATCGTGCGCATGTACTCCCCCGTGT 

CCATCCTGGACATCCGCCAGGGCCCCAAGGAGCCCTTCCGCGACTACGTG 

GACCG CTTCTTC AAGAC C CTGCGC GC C GAGCAGGCCAC C CAGGACGTGAA 

GAACTGGATGAC C GACACC CTGCTGGTG CAGAAC GCCAAC CCC GACTGCA 

AGACCATCCTGAAGGCCCTGGGCCCCGGCGCCACCCTGGAGGAGATGATG 

ACCGCCTGCCAGGGCGTGGGCGGCCCCGGCCACAAGGCCCGCGTGCTGGC 

CGAGGCCATGTCCCAGGTGACC AACGCC GC CATCATGATG CAGC G CGG CA 

ACTTCAAG GG CCAG CGC C G CATCATCAAGTG C TTCAACTG CGG CAAGGAG 

GGCCACATCGCCCGCAACTGCCGCGCCCCCCGCAAGAAGGGCTGCTGGAA 

GTG C GGCAAG GAGGGC CAC CAGATGAAG GACTGCAC C GAG CGC GAGGC CA 

ACTTCCTGGGCAAGATCTGGCCCTCCAACAAGGGCCGCCCCGGCAACTTC 

CTGCAGTCCCGCCCCGAGCCCACCGCCCCCCCCGCCGAGTCCTTCGGCTT 

CGGCGAGGAGATCACCCCCTCCCCCAAGCAGGAGCCCAAGGACAAGGAGC 

CCCCCCTGACCTCCCTGAAGTCCCTGTTCGGCAACGACCCCCTGTCCCAG 

TAA 



M.con.pol.nuc 

GCCGCCGCCatgccccagatcaccctgtggcagcgccccctggtgaccat 

caagatcggcggccagctgaaggaggccctgctggccaccggcgccgacg 

acaccgtgctggaggagatcaacctgcccggcaagtggaagcccaagatg 

atcggcggcatcggcggcttcatcaaggtgcgccagtacgaccagatcct 

gatcgagatctgcggcaagaaggccatcggcaccgtgctggtgggcccca 

cccccgtgaacatcatcggccgcaacatgctgacccagatcggctgcacc 

ctgaacttccccatctcccccatcgagaccgtgcccgtgaagctgaagcc 

cggcatggacggccccaaggtgaagcagtggcccctgaccgaggagaaga 

tcaaggccctgaccgagatctgcaccgagatggagaaggagggcaagatc 

tccaagatcggccccgagaacccctacaacacccccatcttcgccatcaa 

gaagaaggactccaccaagtggcgcaagctggtggacttccgcgagctga 

acaagcgc^cccaggacttctgggaggtgcagctgggcatcccccacccc 

gccggcctgaagaagaagaagtccgtgaccgtgctggacgtgggcgacgc 

ctacttctccgtgcccctggacgaggacttccgcaagtacaccgccttca 

ccatcccctccatcaacaacgagacccccggcatccgctaccagtacaac 

gtgctgccccagggctggaagggctcccccgccatcttccagtcctccat 

gaccaagatcctggagcccttccgcacccagaaccccgagatcgtgatct 

accagtacatggacgacctgtacgtgggctccgacctggagatcggccag 

caccgcgccaagatcgaggagctgcgcgagcacctgctgcgctggggctt 

caeca cccccgacaagaagcaccagaaggagccccccttcctgtggatgg 

gctacgagctgcaccccgacaagtggaccgtgcagcccatccagctgccc 

gagaaggactcctggaccgtgaacgacatccagaagctggtgggcaagct 

gaactgggcctcccagatctaccccggcatcaaggtgaagcagctgtgca 

agctgctgcgcggcgccaaggccctgaccgacatcgtgcccctgaccgag 

gaggccgagctggagctggccgagaaccgcgagatcctgaaggagcccgt 

gcacggcgtgtactacgacccctccaaggacctgatcgccgagatccaga 

agcagggccaggaccagtggacctaccagatctaccaggagcccttcaag 

aacctcaagaccggcaagtacgccaagatgcgctccgcccacaccaacga 

cgtgaagcagctgaccgaggccgtgcagaagatcgccaccgagtccatcg 

tgatctggggcaagacccccaagttccgcctgcccatccagaaggagacc 




tgggagacctggtggaccgagtactggcaggccacctggattcccgagtg 

ggagttcgtgaacaccccccccctggtgaagctgtggtaccagctggaga 

aggagcccatcgccggcgccgagaccttctacgtggacggcgccgccaac 

cgcgagaccaagctgggcaaggccggctacgtgaccgaccgcggccgcca 

gaaggtggtgtccctgaccgagaccaccaaccagaaaaccgagctgcagg 

ccatccacctggccctgcaggactccggctccgaggtgaacatcgtgacc 

gactcccagtacgccctgggcatcatccaggcccagcccgacaagtccga 

gtccgagctggtgaaccagatcatcgagcagctgatcaagaaggagaagg 

tgtacctgtcctgggtgcccgcccacaagggcatcggcggcaacgagcag 

gtggacaagctggtgtccaccggcatccgcaaggtgctgttcctggacgg 

catcgacaaggcccaggaggagcacgagaagtaccactccaactggcgcg 

ccatggcctccgacttcaacctgccccccatcgtggccaaggagatcgtg 

gcctcctgcgacaagtgccagctgaagggcgaggccatgcacggccaggt 

ggactgctcccccggcatctggcagctggactgcacccacctggagggca 

agatcatcctggtggccgtgcacgtggcctccggctacatcgaggccgag 

gtgatccccgccgagaccggccaggagaccgcctacttcatcctgaagct 

ggccggccgctggcccgtgaaggtgatccacaccgacaacggctccaact 

tcacctccgccgccgtgaaggccgcctgctggtgggccggcatccagcag 

gagttcggcatcccctacaacccccagtcccagggcgtggtggagtccat 

gaacaaggagctgaagaagatcatcggccaggtgcgcgaccaggccgagc 

acctcaagaccgccgtgcagatggccgtgttcatccacaacttcaagcgc 

aagggcggcatcggcggctactccgccggcgagcgcatcatcgacatcat 

cgccaccgacatccagaccaaggagctgcagaagcagatcaccaagatcc 

agaacttccgcgtgtactaccgcgactcccgcgaccccatctggaagggc 

cccgccaagctgctgtggaagggcgagggcgccgtggtgatccaggacaa 

ctccgacatcaaggtggtgccccgccgcaaggccaagatcatccgcgact 

acggcaagcagatggccggcgacgactgcgtggccggccgccaggacgag 

gacTAA 




M.con.nef (group M consensus nef. Identical amino acid sequence to that in the public domain) 

GCCGCCGCCATGGGCGGCAAGTGGTCCAAGTCCTCCATCGTGGGCTGGCC 

CGCCGTGCGCGAGCGCATCCGCCGCACCCACCCCGCCGCCGAGGGCGTGG 

GCGCCGTGTCCCAGGACCTGGACAAGCACGGCGCCATCACCTCCTCCAAC 

ACCGCCGCCAACAACCCCGACTGCGCCTGGCTGGAGGCCCAGGAGGAGGA 

GGAGGAGGTGGGCTTCCCCGTGCGCCCCCAGGTGCCCCTGCGCCCCATGA 

CCTACAAGGCCGCCCTGGACCTGTCCCACTTCCTGAAGGAGAAGGGCGGC 

CTGGAGG GC CTGATCTACTC CAAGAAG C GC CAGGAGATCC TGGACCTGTG 

GGTGTACCACACCCAGGGCTACTTCCCCGACTGGCAGAACTACACCCCCG 

GCCC CGGCATCC GCTACCCCCTGACCTTCGGCTG GTGCTTCAAGCTGGTG 

CCCGTGGACCCCGAGGAGGTGGAGGAGGCCAACGAGGGCGAGAACAACTC 

CCTGCTGCACCCCATGTGCCAGCACGGCATGGAG GAC GAGGAG CGC GAGG 

TG CTGATGTG GAAGTTCGACTC C CG CCTGGCC CTGC G CCACATC GC C C G C 

G AGCTGC AC CCCGAGTACTACAAGGACTGCTAA 



C.con.pol.nuc 

GCCGCCGCCatgccccagatcaccctgtggcagc^cca:rtggtgtccat 

caaggtgggc^gccagatcaaggaggccctgctggccaccggcgccgacg 

acaccgtgctggaggagatcaacctgcccggcaagtggaagcccaagatg 

atcggcggcatcggcggcttcatcaaggtgcgccagtacgaccagatcct 

gatcgagatctgcggcaagaaggccatcggcaccgtgctggtgggcccca 

cccccgtgaacatcatcggccgcaacatgctgacccagctgggctgcacc 

ctgaacttccccatctcccccatcgagaccgtgcccgtgaagctgaagcc 

cggcatggacggccccaaggtgaagcagtggcccctgaccgaggagaaga 

tcaaggccctgaccgccatctgcgaggagatggagaaggagggcaagatc 

accaagatcggccccgagaacccctacaacacccccgtgttcgccatcaa 

gaagaaggactccaccaagtggcgcaagctggtggacttccgcgagctga 

acaagcgcacccaggacttctgggaggtgcagctgggcatcccccacccc 

gccggcctgaagaagaagaagtccgtgaccgtgctggacgtgggcgacgc 

ctacttctccgtgcccctggac^agggcttccgcaagtacaccgccttca 

ccatcccctccatcaacaacgagacccccggcatccgctaccagtacaac 

gtgrtgTCcxagggctggaagggrtccccxgccatcttccagtcctccat 

gaccaagatcctggagcccttccgcgcccagaaccccgagatcgtgatct 

accagtacatggacgacctgtacgtgggctccgacctggagatcggccag 

caccgcgccaagatcgaggagctgcgcgagcacctgctgaagtggggctt 

caccacccccgacaagaagcaccagaaggagccccccttcctgtggatgg 

gc^cgagctgcacxccgacaagtggaccgtgragcccatccagctgccc 

gagaaggactcctggaccgtgaacgacatccagaagctggtgggcaagct 

gaactgggcctcccagatctaccccggcatcaaggtgcgccagctgtgca 

agctgctgcgcggcgccaaggccctgaccgacatcgtgcccctgaccgag 

gaggccgagctggagctggccgagaaccgcgagatcctgaaggagcccgt 

gc^cggcgtg^ctacgacccctccaaggacctgatcgccgagatccaga 

agcagggccacgaccagtggacctaccagatctaccaggagcccttcaag 

aacctcaagaccggcaagtacgccaagatgcgcaccgcccacaccaacga 

cgtgaagcagctgaccgaggccgtgcagaagatcgccatggagtccatcg 

tgatctggggcaagacccccaagttccgcctgcccatccagaaggagacc 

tgggagacctggtggaccgactactggcaggccacctggattcccgagtg 

ggagttcgtgaacaccccccccctggtgaagctgtggtaccagctggaga 

aggagcccatcgccggcgccgagaccttctacgtggacggcgccgccaac 

cgcgagaccaagatcggcaaggccggctacgtgaccgaccgcggccgcca 

gaagatcgtgtccctgaccgagaccaccaaccagaaaaccgagctgcagg 

ccatccagctggccctgcaggactccggctccgaggtgaacatcgtgacc 

gactcccagtacgccctgggcatcatccaggcccagcccgacaagtccga 

gtccgagctggtgaaccagatcatcgagcagctgatcaagaaggagcgcg 

tgtacctgtcctgggtgcccgcccacaagggcatcggcggcaacgagcag 

gtggacaagctggtgtcctccggcatccgcaaggtgctgttcctggacgg 

catcgacaaggcccaggaggagcacgagaagtaccactccaactggcgcg 

ccatggcctccgagttcaacctgcx^cccatcgtggccaaggagatcgtg 

gcctcctgcgacaagtgccagctgaagggcgaggccatgcacggccaggt 

ggac^gctcccccggcatctggcagctggactgcacccacctggagggca 

agatcatcctggtggccgtgcacgtggcctccggctacatcgaggccgag 






gtgatccccgccgagaccggccaggagaccgcctacttcatcctgaagct 

ggccggccgctggcccgtgaaggtgatccacaccgacaacggctccaact 

tcacctccgccgccgtgaaggccgcctgctggtgggccggcatccagcag 

gagttcggcatcccctacaacccccagtcccagggcgtggtggagtccat 

gaacaaggagctgaagaagatcatcggccaggtgcgcgaccaggccgagc 

acctcaagaccgccgtgcagatggccgtgttcatccacaacttcaagcgc 

aagggcggcatcggcggctactccgccggcgagcgcatcatcgacatcat 

cgccaccgacatccagaccaaggagctgcagaagcagatcatcaagatcc 

agaacttc^cgtgtactaccgcgactcccgcgaccccatctggiaagggc 

cccgccaagctgctgtggaagggqgagggcgccgtggtgatccaggacaa 

ctccgacatcaaggtggtgccccgccgcaaggccaagatcatcaaggact 

acggcaagcagatggccggcgccgactgcgtggccggccgccaggacgag 

gacTAA 



SLFGNDPLSQ 



MPQ?£wQR^£^al^ 
VCMGTVLV^PTPVNIIGRNMLT 

MEKEGKISWGPEN PYNTPI FAI KKKD STKWRKLVDFRELN KRTQ DFWEVQLG 'PHP/^SUOOOC^iT^/L.D 

VGOAyVsVPLOEDFTOKYTAFTIPSINNETPGIRYQYNVLPQGWKGSPAIFQSSM^ 

YQYMDDLYVGSDLHGQHRAKIEEU^HUJW/GFTTPDKKHQKEPPF^ 

VYLSV^AHKG GGNEQV^V^ 

whtoSfYIS^^ 

FIHNFKRKGGIGGYSAGERHDIIATDIQTKELQKQITKIQNFR>T(^DSRDPIWKGPAI<LLVVKGEGAW 
IQDNSDIKWPRRKAKHRDYGKQMAGDOCVAGRQDED 




M.con.nef (group M consensus nef) 

MGGKWSKSSIVGWPAVRERIRRTHPAAEGVGAVSQDLDKHGAITSSNTAANNPDCAVA/LEAQEEEEEVGFP 

WPQ\^l^PMTYKAALDLSHR.KEKGGLEGUYSKKRQElU3LW\^HTaGYFPDWQNYTPGPGlRYPLTF 

GWCFKLVPVDPEEVEEANEGENNSLLHPMCQHGMEDEEREVLMWKFDSRUUJ^HIARELHPEYYKDC 






C.con.pol (subtype C consensus pol) 

MPQrTLWQRPLVSIKVGGQ1KEAU^TGADDTVl£BNU a GKWKPKMIGGIGGFIKVRQYDQIUEICGK 
KAIGTS/WGPTPVNIIGRNMLTQLGCTVKFPISPIETVPVKLKPGMDGPKW 

MEKEGKITXIGPENPYhTTPVFAIKKKDSTKWRKLN^FRELNKRTQDFWEVQLGlPHPAGLKKKKSNnVLD 
VGDAYFSVPLDEGFRKYTAFTIPSINNETPGIRYQYNVIJ*QGVWGSPAIFQSSMTXILEPFRAQNPEIVI 
YQYMDDLWGSDL£IGQHRAKIEELREHUKWGFTTPDKKHQKEPPFLVW^ 
SWTV/NDIQKLVGKLNWASQIYPGIKVRQLCKLLRGAKALTO 

PSKDL1AEIQKQGHDQWTYQ1YQEPFKNLKTGKYAKMRTAHTNDVKQLTEAVQK1AMES1VIWGKTPKFR 
LPIQKETWETWWTDYWQATW^EVVEFVWTPPLVKLVW 

RGRQKIVSLTHTTNQKTELQAIQLALQDSGSEVNIVTOSQYALGHQAQPDKSESELVNQIIEQUKKER 
VYLSVVVPAHKGIGGNEQVDKLVSSGIRKV^LDGIDKAQEEHEKYWSN^^ 

DKCQLXGEAMHGQV^CSPGIWQLDCTHLEGKIILVAVHVASGYIEAEVIPAETGQETAYFILXLAGRWPV 

KV1 HTD NGS N FTSAAVKAACWWAG 1 Q QEFG I PYNP Q ^ Q ^^J^^^ ^^^*!^1L5SY5 P.9^S^ e!rZvw^ MAV 
FIHNFKRKGGIGGYSAGERIIDIIATDIQTKELQKQIIWQNFRVYYT^DSRDPIWKGPAKLXWKGEGAW 

1QDNSDIKWPRRKAKI1KDYGKQMAGADCVAGRQDED 




B.con.gag (subtype B consensus gag. The amino acid sequence is different from Los Alamos Database August 2002)

GCCGCCGCCATGGGCGCCCGCGCCTCCGTGCTGTCCGGCGGCGAGCTGGA 

CCGCTGGGAGAAGATCCGCCTGCGCCCCGGCGGCAAGAAGAAGTACAAGC 

TGAAGCACATCGTGTGGGCCTCCCGCGAGCTGGAGCGCTTCGCCGTGAAC 

CCCGGCCTGCTGGAGACCTCCGAGGGCTGCCGCCAGATCCTGGGCCAGCT 

GCAGCCCTCCCTGCAGACCGGCTCCGAGGAGCTGCGCTCCCTGTACAACA 

CCGTGGCCACCCTGTACTGCGTGCACCAGCGCATCGAGGTGAAGGACACC 

AAGGAGGC CCTG GAGAAGATCGA GGAG GAG CAGAACAAGTC CAAGAAGAA 

GGCCCAGCAGGCCG CCGCCGACAC CGGGAACTC CTCCCAG GTGTCCCAGA 

ACTACCCCATCGTGCAGAACCTGCAGGGCCAGATGGTGCACCAGGCCATC 

TCCCCCCGCACCCTGAACGCCTGGGTGAAGGTGGTGGAGGAGAAGGCCTT 

CTCCCCCGAGGTGATCCCCATGTTCTCCGCCCTGTCCGAGGGC GCCACC C 

CCCAGGACCTGAACACCATGCTGAACACCGTGGGCGGCCACCAGGCCGCC 

ATG C AGATG CTGAAGGAGACCATCAAC GAG GAGG CC G CCGAGTGG GACCG 

CCTGCACCCCGTGCACGCCGGCCCCATCGCCCCCGGCCAGATGCGCGAGC 

CCCGCGGCTCCGACATCGCCGGCACCACCTCCACCCTGCAGGAGCAGATC 

GGCTGGATGAC CAACAACCCCCCCATCCCC GTGG GCGAGATCTACAAGC G 

CTGGATCATCCTGGGCCTGAACAAGATCGTGCGCATGTACTCCCCCACCT 

CCATCCTGGACATCCGCCAGGGCCCCAAGGAGCCCTTCCGCGACTACGTG 

GACCGCTTCTACAAGACCCTGCGCGCCGAGCAGGCCTCCCAGGAGGTGAA 

GAACTGGATGACCGAGACCCTGCTGGTGCAGAACGCCAACCC CGACTGCA 

AGACCATC C TGAAG GC CCTGGG CC C CGCCGC CACCCTG GAG GAGATGATG 

ACCGCCTGCCAGGGCGTGGGCGGCCCCGGCCAGAAGGCCCGCGTGCTGGC 

CGAGGCCATGTCCCAGGTGACCAACTCCGCCACCATCATGATGCAGCGCG 

GC AACTTCCGCAAC CAGCGCAAGACCGTGAAGTGCTTCAACTGC GGCAAG 

GAGGGCCAGATCGCCAAGAACTGCCGCGCCCCCCGCAAGAAGGGCTGCTG 

GAAGTGCGGCAAGGAGGGC CACCAGATGAAGGACTGCACCGAGCG C CAGG 

CCAACTTCCTGGGCAAGATCTGGCCCTCCCACAAGGGCCGCCCCGGCAAC 

TTCCTGCAGTCCCGCCCCGAGCCCACCGCCCCCCCCGAGGAGTCCTTCCG 

CTTCGGCGAGGAGACCACCACCCCCTCCCAGAAGCAGGAGCCCATCGACA 

AG GAG CT GTAC CCC CTGG C CTC C CTG CGCTC C CTGTTCG GC AAC GAC C CC 

TCCTCCCAGTAA 






B.con.env (subtype B consensus env. The amino acid sequence is different from Los Alamos Database August 2002) 

G CCGC CGCCATGCGCGTGAAGGGCATCCGCAAGAACTACGAGCACCTGTG 

GCGCTGGGGCACCATGCTGCTGGGCATGCTGATGATCTGCTCCGCCGCCG 

AGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTGTGGAAGGAGGCC 

ACCACCACCCTGTTCTGCGCCTCCGACGCCAAGGCCTACGACACCGAGGT 

GCAGAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCC 

AGGAG GTG GTGCTGGAGAACGTGACC GAGAACTTCAACATGTG GAAGAAC 

AACATGGTGGAGCAGATGCACGAGGACATCATCTCCCTGTGGGACCAGTC 

CCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTGACCCTGAACTGCA 

C C GAC CTGAA GAAC AAC CTGCTGAACAC CAACTCCTC CTC CGGCGAGAAG 

ATGGAGAAGGGCGAGATCAAGAACTGCTCCTTCAACATCACCACCTCCAT 

CCGCGACAAGGTGCAGAAGGAGTACGCCCTGTTCTACAAGCTGGACGTGG 

TGC CCATC GACAAC AACAAC AACAC CTCCTACC GC CTGATCTC CTGCAAC 

AC CT CC GTGATCAC CCAGG CCTGC CCC AAGGTGTC CTTCGAGC CCATCCC 

CATCCACTACTG CGCCCCCGC CGGCTTC6CCATCCTGAAGTGCAACG ACA 

AGAAGTTCAACGGCACCGGCC CCTGCACCAACGTGTCCACCGTGCAGTG C 

AC CCACG GCATC CGC C C CGTG GTGTCCACC CAGCTGCTGCTGAAC G GCTC 

CCTGGC C GAG GAG GAGGTG GTGATC CG CTC C GAGAACTTCACCGACAAC G 

CCAAGACCATCATCGTGCAGCTGAACGAGTCCGTGGAGATCAACTGCACC 

CGCCCCAACAACAACACCCGCAAGTCCATCCACATCGGCCCCGGCCGCGC 

CTTCTACACCACCGGCGAGATCATCGGCGACATCCGCCAGGCCCACTGCA 
ACATCTC C CG CG CCAAGTG GAACAACAC CCTG AAG CAGATC GTGAAGAAG 

CTG C G CGAG CAGTTCGGCAACAAGAC CATC GTGTTCAACCAGTC CTC C GG 

CGGCGACCCCGAGATCGTGATGCACTCCTTCAACTGCGGCGGCGAGTTCT 

TCT AC TG CAACACCACCCAG CTGTTCAACTC CACCTG GAACGACAAC GG C 

ACCTG GAACAACAC CAAGGACAAGAACACCATCAC CCTGC C CTG C C GCAT 

CAAGCAGATCATCAACATGTGGCAGGAGGTGG GCAAGGCCATGTACGC C C 

CCCCCATCCGCGGCCAGATCCGCTGCTCCTCCAACATCACCGGCCTGCTG 

CTGACCCGCGACGGCGGCAACAACAACAACGACACCGAGATCTTCCGCCC 

CGGCGGCGGCGACATGCGCGACAACTGGCGCTCCGAGCTGTACAAGTACA 

AGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCGC 

CGCGTGGTGCAGCGCGAGAAGCGCGCCGTGGGCATCGGCGCCATGTTCCT 

GGGCTTCCTGGGCGCCGCCGGCTCCACCATGGGCGCCGCCTCCATGACCC 

TGACCGTGCAGGCCCGCCAGCTGCTGTCCGGCATCGTGCAGCAGCAGAAC 

AACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGT 

GTGGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACC 

TGAAGGAC CAGCAG CTG CTGG GCATCTG GG G CTG CTCC GG CAAG CTGATC 

TGCACCACCACCGTGCC CTGGAACGCCTCCTGGTC CAACAAGTCC CTG GA 

CGAGATC TGG GACAACATGAC CTGGATG GAGTGG GAGC G C GAGATC GACA 

ACTACAC CTC CCTGATCTACAC CCTGATC GAGGAGTC C CAGAAC CAG CAG 

GAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGG GCCTCC CTGTG 

GAACTGGTTCGACATCACCAACTGGCTGTGGTACATCAAGATCTTCATCA 

TGATCGTGGGCGGCCT6ATCGGC CTGCGCATCGTGTTCGC CGTGCTGTCC 

ATCGTGAACCGCGTGCGCCAGGGCTACTCCCCCCTGTCCTTCCAGACCCG 

G C GG CGAGCGC GAC CGCGACCG CTCC GGCCGCCTGGTGGACGGCTTC CTG 

GCCCTGATCTGGGACGACCTGCGCTCCCTGTGCCTGTTCTCCTACCACCG 

CCTGCGCGACCTGCTGCTGATCGTGACCCGCATCGTGGAGCTGCTGGGCC 

GCCGCGGCTGGGAGGTGCTGAAGTACTGGTGGAACCTGCTGCAGTACTGG 

TCCCAGGAGCTGAAGAACTCCGCCGTGTCCCTGCTGAACGCCACCGCCAT 

CGCCGTGGCCGAGGGCACCGACCGCGTGATCGAGGTGGTGCAGCGCGCCT 

GCCGCGCCATCCTGCACATCCCCCGCCGCATCCGCCAGGGCCTGGAGCGC 

GCC CTGCTGTAA 
Year 2000 Con-S 140CFI.Env

MRVRG I QRNCQHIiWRWGTL I LGMLM I C S AAEI^WVTVYYGV PVWKEANTTL FC AS DAKAYDTEVH 
NVWATHACVPTDPNPQE IVLENVTENFNMWK^I^I^IVEQMHED 1 1 SLWDQS LK PC VKLTPLC VTLNC 
TNVNVTNTTNNTEEKGEI KNC S FN I TTE I RDKKQKVYALF YRLDVVP I DDNNNNS SNYRL INCNT 
SAITQACPKVSFEPIPIHYCAPAGFAILKCNDKKFNGTGPCKNVSTVQCTHGIKPVVSTQLLiLiNG 
SLAEEEIIIRSENITNNAKTIIVQIjNESVEINCTRPNNNTRKSIRIGPGQAFYATGDIIGDIRQA 
HCN I SGTKWNKTLQQVAKKLiREHFNNKT 1 1 FKP S SGGDLE I TTHS FNCRGE FF YCNTSGLFNSTW 
IGNGTKNNNNTI^^ 

ETEIFRPGGGDMRDNVTOSELYKYKVVKIEPLGVAPTKAKLTVQARQLLSGIVQQQSNLIiRAIEAQ 

QHLLQLTVWGIKQLQARVLAVERYLKDQQTyEIVTO 

NEQELLALDKWASLWNWFDITNWLW 

gp140 CFI refers to an HIV-1 envelope design in which the cleavage site (C), the fusion site
(F) and the gp41 immunodominant region (I) are deleted, in addition to the deletion of the
transmembrane and cytoplasmic domains.

Codon-optimized Year 2000 Con-S 140CFI.seq

ATGCGCGTGCGCGGCATCCAGCGCAACTGCCAGGACCTGTGGCGCTGGGGCACCCTGATCCTGGG 
CATGCTGATGATCTGCTCCGCCGCCGAGAACCTGTGGGTGACCGTGTACTACGGCGTGCCCGTGT 
GGAAGGAGGCCAACACCACCCTGTTCTGCGCCTCCGACGCCAAGGCCTACGACACCGAGGTGCAC 
AACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATCGTGCTGGAGAA 
CGTG AC CG AGAAC TTC AAC ATGTGG AAGAAC AAC ATGGTGG AGC AG ATGC ACG AGG AC ATC ATCT 
CCCTGTGGGACCAGTCCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTGACCCTGAACTGC 
ACCAACGTGAACGTGACCAACACCACCAACAACACCGAGGAGAAGGGCGAGATCAAGAACTGCTC 
CTTCAACATCACCACCGAGATCCGCGACAAGAAGCAGAAGGTGTACGCCCTGTTCTACCGCCTGG 
ACGTGGTGCCCATCGACGACAACAACAACAACTCCTCCAACTACCGCCTGATCAACTGCAACACC 
TCCGCCATCACCCAGGCCTGCCCCAAGGTGTCCTTCGAGCCCATCCCCATCCACTACTGCGCCCC 
CGCCC^CTTCGCCATCCTGAAGTGCAACGACAAGAAGTTCAACGGCACCGGCCCCTGCAAGAACG 
TGTCCACCGTGCAGTGCACCCACGGCATCAAGCCCGTGGTGTCCACCCAGCTGCTGCTGAACGGC 
TCCCTGGCCGAGGAGGAGATCATCATCCGCTCCGAGAACATCACCAACAACGCCAAGACCATCAT 
CGTGCAGCTGAACGAGTCCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCAAGTCCA 
TCCGCATCGGCCCCGGCCAGGCCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCC 
CACTGCAACATCTCCGGCACCAAGTGGAACAAGACCCTGCAGCAGGTGGCCAAGAAGCTGCGCGA 
GCACTTCAACAACAAGACCATCATCTTCAAGCCCTCCTCCGGCGGCGACCTGGAGATCACCACCC 
ACTCCTTCAACTGCCGCGGCGAGTTCTTCTACTGCAACACCTCCGGCCTGTTCAACTCCACCTGG 
ATCGGCAACGGCACCAAGAACAACAACAACACCAACGACACCATCACCCTGCCCTGCCGCATCAA 
GCAGATCATCAACATGTGGCAGGGCGTGGGCCAGGCCATGTACGCCCCCCCCATCGAGGGCAAGA 
TCACCTGCAAGTCCAACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAACAACAACACCAAC 
GAGACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCTCCGAGCTGTACAA 
GTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCTTACCGTGCAGG 
CCCGCCAGCTGCTGTCCGGCATCGTGCAGCAGCAGTCCAACCTGCTGCGCGCCATCGAGGCCCAG 
CAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGA 
GCGCTACCTGAAGGACCAGCAGCTCGAGATCTGGGACAACATGACCTGGATGGAGTGGGAGCGCG 
AG ATC AAC AACTAC ACCGAC ATC ATC TAC TC CCTG ATCG AGG AGTCCC AG AAC C AGC AGG AG AAG 

AACGAGCAGGAGCTGCTGGCCCTGGACAAGTGGGCCTCCCTGTGGAACTGGTTCGACATCACCAA 
CTGGCTGTGGTGAGGATCC 
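A codon-optimized sequence such as the one above can be checked against its peptide listing by translating it with the standard genetic code. The sketch below is illustrative only: the `translate` helper is hypothetical, it strips the stray spaces left by text extraction, and it reads from the first base, so any flanking sequence before the ATG would need to be trimmed first.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate DNA in frame 1 up to the first stop codon; unknown codons give 'X'."""
    seq = "".join(dna.split()).upper()      # drop whitespace from extraction
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Opening codons of the Con-S 140CFI sequence as printed above.
print(translate("ATGCGCGTGCGCGGCATCCAGCGCAACTGCCAG"))  # MRVRGIQRNCQ
```

Codons damaged by extraction (for example a `^` character) come out as `X`, which makes garbled positions easy to spot when comparing against the peptide.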



Design of expression-optimized HIV-1 envelope gp140CF

Con-B-2003 Env.pep (841 a.a.)*

MRVKG I RKNYQHJjWRWGTMLLiGMLMI C SAAEKLWVTVYYGV PVWK£ATTTLFCA.SDAKAYDTEVHNVWATHACVPTDPNPQE^VIj 

ENVTENFNMWKNNM\reQMHEDIISLVTOQ 

AIJ^KLDVX^IDNDNTSYRLISCNTSVITQA 

LIjLNGSIAEEEVVIRSENFTDNAKTIIVQIiNESVEINCTRPNNNTRKSIHIGPGRAPYTTGEIIGDIRQA 
IVKKLREQFGNKTIVFNQSSGGDPEIVMHSFNCGGEFFYCNTTQLFN^ 

I RGQIRC S SNI TGLLLTRDGGNNETE I FRPGGGDMRDNWRS ELYKYKWKI EPLGVAPTKA KRRWOREKRAVGIGAMFIXSFLGA 
AGSTMGAASMT LTVQARQLLSG I VQQQNNL1L1RAI EAQQHLLQLTVWGIKQliQARVLAVERYLKDQQLLtG IWGC SGKLI CTTAVPW 
NASWSNKSLDEIWDNMTWMEWEREIDNYTSLIYTLIEESQNQQEI^EQELI/ELDKWASLWNWF 
RIVFAVLSXVNRVRQGYSPLSFQTRIjPAPRGPDRPEGIEEEGGERDRDRSGRIj\hX5FLALIWDDIj 
IVELLGRJIGWEVLKYWWI^LQYWSQELKNSAVSIjLNATAIAVAEGTDRVI EWQRAC RAI LH I PRRI RQGLiERALiLf 
*The underlined amino acid sequence is the fusion domain that will be deleted in the 140CF
design. The "W" underlined in red is the last amino acid at the C terminus; all the
remaining amino acids after this "W" will be deleted in the 140CF design.
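The two deletions described in this note (removing the fusion domain and truncating after the final "W") amount to simple string operations on the full-length peptide. The sketch below assumes hypothetical coordinates and a toy peptide; the real positions of the fusion domain and of the terminal "W" are marked only by underlining in the original figure and are not reproduced here, and the cleavage-site deletion is not modeled.

```python
def make_140cf(env: str, fusion_start: int, fusion_end: int, last_kept: int) -> str:
    """Delete env[fusion_start:fusion_end] and everything after index last_kept.

    Indices are 0-based; fusion_end is exclusive and last_kept is inclusive.
    """
    if not 0 <= fusion_start <= fusion_end <= last_kept < len(env):
        raise ValueError("coordinates are out of order or out of range")
    return env[:fusion_start] + env[fusion_end:last_kept + 1]


# Toy peptide (placeholder): leader, "fusion" stretch, core, terminal W, tail.
toy_env = "MKT" + "AVGIGA" + "LTVQ" + "NWLW" + "YIKIF"
truncated = make_140cf(toy_env, fusion_start=3, fusion_end=9, last_kept=16)
print(truncated)
```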

Con-B-140CF.pep (632 a.a.)
Nick name: 002 

MRVKGIRKNYQHliWRWGTMLiLfGMLiMI C SAAEKLWVTVYYGV PVWKEATTTLFC ASDAKAYDTEVHNVWATHACVPTDPNPQEVVL 
ENVTENFNMWKNNMVEQMHEDIISLWD^ 

ALFYKLDVVPIDNDNTSYRLISCNTSVITQACPKVSFEPIPIHYCAP 

LIJ^GSLAEEEVVIRSENFTDNAKTIIVQLNESVEINCTRPNNN^ 

IVKKLREQFGNKTIVFNQSSGGDPEIVMHSFNCGGEFFYO^ 

IRGQI RCSSNITGLLLTRDGGNNETEIFRPGGGDMRDNWRSEliYKYKVVKI EPLGVAPTKAKTLTVQARQIiIiSGIVQQQNNlilJtA 
ZEAQQHXJiQXjTVWGXKQX^ARVXA^^ 

*Amino acids shown in blue are for easy identification of the junction of the
deleted fusion cleavage site.

Codon-optimized Con-B 140CF.seq (1927 nt.)
Nick name: 002 

TTCAGTCGACGGCCACCATGAGGGTGAAGGGTATTCGGAAAAATTACCAACACCT 
GCTGATGATTTGCAGTGCCGCCGAGAAACTTTGGGTAACTGTGTACT^ 

TTTTGTGC ATCC GAC GCT AAAGCTTACGAC AC AG AAGTGCATAATGTTTGGGC CAC C C ATGC TTGC GTCCCTAC AGATC C C AAC C 
CCC AGGAAGTC GTCC TTGAGAATGTC AC AG AGAATTTTAAC ATGTGGAAGAATAATATGGTAG AAC AAATGC AC GAAGAC ATT AT 
TAGCCTGTGGGACCAGTCCTTGAAGCCCTGCGTGAAACTCACTCCACTTTGCGTCACACTO 

ACCAACACAAATACTACTATTATATATCGCTGGAGGGGGGAAATCAAGAACTGCTCTTTCAACATCACCACTTCCATAA 

AGGTCCAGAAAGAATATGCCCTGTTTTATAAACTTGATGTGGTCCCGATAGACAATGACAACACTAGCTATCGACTGATCTCTTG 

TAACACATCCGTGATTACCCAAGCTTGCCCAAAGGTCAGCTTTGAACCAATACCCATTCAC 

ATCCTCAAGTGTAACGACAAAAAATTCAATGGGACCGGACCTTGCACAAACGTGTCCACCGTGCAATG 

CTGTTGTCAGTACCCAACTCCTCTTGAACGGGTCTCTCGCGGAAGAGGAGGTCGTGATTAGAAGC 

TAAAACAATCATTGTGCAACTTAATGAAAGCGTCGAAATTAACTGCACCAGACCAAACAATAA 

GGGCCCGGCCGCGCATTTTATACAACTGGCGAAATCATTGGTGACATCAGACAAGCTCATTGCAATATCTCCCGCGCG 

AC AAC ACC CTGAAAC AGATC GTG AAGAAACTTC GAGAAC AATTCGGTAATAAAAC AATC GTATTCAAC C AAAGC TC CGGAGGC GA 

CCCTGAGATAGTTATGCACTCATTCAACTGTGGCGGCGAGTTCTTCTATTG 

GG AAC ATGGAAC AAC AC AGAAGGG AAC ATCACTCTGCC TTGTCGGATTAAGC AGATC ATT AAT AAGAAGTGGGAAAAG 

CTATGTACGCCCCGCCTATTCGCGGACAAATAAGATGCTCTAGTAATATTACCGGATTGTTGCT^ 

TGAAACAGAGATATTTAGACCTGGCGGAGGCGACATGAGAGATAACTGGAGAAGTGAGCTTTACAAATATAAA 

GAACCATTGGGGGTAGCACCAACCAAAGCAAAAACCTTGACAGTACAGGCTAGGCAGCTGCTGAGCGGAATCGTGCAACAAC 

ATAATCTTCTCCGAGCCATAGAAGCACAACAACATCTGTTGCAGCTGACAGTATGGGGAATCAAACAG^ 

GGCCGTCGAGAGATACCTCAAAGATCAACAACTGCTGGGCATATGGGGATGTTCCGGTAAACTCATAT^ 

TGGAAC GC G AGC TGGTC TAATAAATC C C TGGATGAAATTTGGGAC AAC ATGAC TTGGATGGAATGGGAAC GGGAAATTGAC AACT 

ATACTAGTTTGATTTATACTCTGATCGAAGAATCTCAGAACCAACAGGAGAAAAACGAACA 

GGCATCATTGTGGAACTGGTTTGACATTACTAACTGGCTGTGGTAAAGATCTTACAA 

(For all 140CF designs shown here and below, the 140CF gene will be flanked with the 5'
sequence "TTCAGTCGACGGCCACC", which contains a Kozak sequence (GCCACCATGG/A) and a
SalI site, and the 3' sequence "TAAAGATCTTACAA", which contains a stop codon and a BglII site.)
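
The flanking scheme in the note above can be illustrated with a short sketch. The two flank sequences are taken from the note, and GTCGAC and AGATCT are the standard SalI and BglII recognition sites; the function names `flank` and `expected_length` are assumptions made for this example. The length formula assumes the listed amino acid count excludes the stop codon, which is supplied by the 3' flank.

```python
FIVE_PRIME = "TTCAGTCGACGGCCACC"  # 5' flank from the note (Kozak + SalI)
THREE_PRIME = "TAAAGATCTTACAA"    # 3' flank from the note (stop + BglII)
SAL_I = "GTCGAC"                  # SalI recognition site
BGL_II = "AGATCT"                 # BglII recognition site


def flank(coding: str) -> str:
    """Return the coding sequence (ATG..last codon, no stop) with both flanks."""
    return FIVE_PRIME + coding + THREE_PRIME


def expected_length(n_amino_acids: int) -> int:
    """Expected construct length in nt for a peptide of n_amino_acids residues."""
    return len(FIVE_PRIME) + 3 * n_amino_acids + len(THREE_PRIME)


# Sanity checks on the flanks themselves.
assert SAL_I in FIVE_PRIME
assert BGL_II in THREE_PRIME
assert THREE_PRIME.startswith("TAA")  # stop codon leads the 3' flank

# Con-B 140CF is listed as 632 a.a. and 1927 nt.
print(expected_length(632))  # 1927
```

Under this assumption the same arithmetic also gives 1891 nt for the 620 a.a. CON-S 140CF design, which agrees with the length listed for that sequence.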

CON_OF_CON-S-2003 (829 a.a.)

MRVMGIQRNCQHLWRWGILIFGMLIICSAAENLWVTVYYGVPVWKEAOT 

ENVTENFNMWKNNMVEQMHEDI I SLWDQSLKPCVKLTPLCVTLNCTDVNATNNTTNNEEI KNC S FNITTE I RDKKKKVYALFYKL 
DVVPIDDNNSYRLINCNTSAITQACPKVSFEPIPIHYCAPAGFAILKCNDKKFNGTGPCK^ 
AEEEIIIRSENITNNAKTIIVQLNESVEINCTRPNNNTRKSIRIGPGQAF 






HFNKTIIFNPSSGGDLEITTHSFNCGGEFFYCNTSELFNSTWNGTNN^ 

CZT .T .T .TR T^GWNNTETFRPGGGDMRDNWRS ELYKYKVVKI EPIX^APTKA KRRVVEREKRAVGIG AVFX «G FLGAAGSTMG AAS ITL 

TVQARQLLSGIVQQQSNLLRAIEAQQHLLQL^^ 

VTONMTWMEWDKEINNYTDI I YSLIEESQ 

VRQGYSPLSFQTLIPNPRGPDRPEGIEEEGGEQDRDRSIRLVNGFIjAI^WDDLiRSLiCLFSYHRIjRDIj 
LKYXjWNIiIiQYWGQELKNSAI SLLDTTAI AVAEGTDRVI EWQRVC RAILNI PRRI RQGFERALL 

*The underlined amino acid sequence is the fusion domain that will be deleted in the
140CF design. The "W" underlined in red is the last amino acid at the C terminus;
all amino acids after this "W" will be deleted in the 140CF design.

CON-S-2003 140CF.pep (620 a.a.)
Nick name: 006 

MRVMGIQRNCQHLWRWGILIFGMLIICSAAENLWVTVYYGVPVWKEANTTLFCASDAKAYDTEVHNVWATHACVPTDPNPQEIVL
ENWENFNMWKNNMVEQMHEDIISL^ 

DWPIDDNNSYRLINCNTSAITQACPKVSFEPIPIHYCAPAGFAILKC1TOKKFNGTGPCKNVSW 

AEEEI 1 1 RSENITNNAKTI I VQLNESVEINCTRPNNNTRKS I RIGPGQAFYATGDI IGDIRQAHCNI SRTKWNKTLQQVAKKLRE 

HFITCTIIFNPSSGGDIiEITTHSFNCGGEFFYCNTSELFNSTWNGT^^ 

GLLLTRDGGNNNTETFRPGGGDMRDNWRSELYK^ 

WGIKQLQARVTAV^YLKDQQLIXSIWGCSGKIjICra 

NEQEX*LAX*r>KWAS LWNWFDITNWLW * 

*Amino acids shown in blue are for easy identification of the junction of the
deleted fusion cleavage site.

CODON-OPTIMIZED CON-S-2003 140CF.seq (1891 nt.)
Nick name: 006

TTCAGTCGACAGCCACCATGCGGGTCATGGGGATACAGAGGAATTGCCAGCACTTGTGGAG 

GCTC ATAATCTGCTC TGCCGCTGAGAACCTGTGGGTC AC TGTGTATTACGGCGTTCCC GTCTGGAAAGAAGC TAATACTACC CTG 
TTTTGTGC AAGCGAC GCC AAAGC ATACGAC ACCGAAGTC C AC AATGTCTGGGCTACCC ACGCCTGTGTACCTAC TGATCC AAATC 
CCC AGGAAATTGTTC TTGAAAACGTAACGGAAAACTTTAAC ATGTGGAAGAATAATATGGTGGAGC AAATGC AC GAGGATATAAT 
C AGC C TGTGGGAC C AGTC C C TC AAAC C ATGC GTT AAAC T C AC TC C AC TC TGC GTG AC TC TG AAC TGT AC C GAC GTGAAC GC AAC C 
AATAATACAACAAACAATGAGGAGATAAAGAATTGTTCATTTAATATAACCACTGAGATACGGGATAAGAAAAAAAAGG 
CACTCTTTTACAAGCTCGACGTGGTGCCCATAGACGACAATAATAGCTACCGACTCATTAATTGCAATACTAGCGCTATAACCCA 
GGCATGCCCCAAAGTTTCCTTCGAGCCCATACCGATTCACTACTGCGCACCC^ 

AAGTTCAACGGCACCGGACCCTGTAAGAACGTAAGCACTGTTCAATGTACACATGGAATTAAGCCGGTAGTGTCAACGCAGCTCC 

TCCTCAACGGAAGCCTTGCAGAAGAAGAGATCATTATCAGGTCAGAAAATATCACTAACAACGCGAAAACAATC 

GAATGAGTCTGTAGAAATCAATTGTACCCGCCCTAATAATAACACAAGA7VAGTCAATTAGGATCGGACCCGGCCAGGCTTTCTAC 

GCAACCGGAC^TATCATCGGGGATATACGACAGGCCCACTGCAACATTTCTAGAACTAAGTGGAATAAAACTTTC 

CCAAGAAACTGCGGGAACATTTTAATAAGACAATCATCTTCAATCCAAGTAGCGGAGGGGACCTGGAAATCACTACACATT^ 

TAAC TGTGGGGGCGAGTTTTTCTACTGTAATACCTC TGAACTGTTC AACTC AAC ATGGAATGGC AC TAACAATACTATAACTCTT 

CCTTGCAGAATAAAACAGATTATCAACATGTGGCAGGGTGTGGGGCAAGCAATGTATGCACCACCAATCGAAGGCAAAATAAGAT 

GC ACCTCC AATATTACCGGACTC CTC CTGAC ACGGGATGGCGGAAAC AATAAC ACGGAGACCTTTAGGCC AGGCGGCGGCGATAT 

GAGAGATAACTGGCGCTCCGAGCTCTATAAATACAAAGTCGTTAAGATCGAGCCCCTTGGAGTTGCGCCAACCAAAGCTAAAACC 

TTGACCGTGCAAGCCAGGCAGTTGTTGTCAGGTATCGTACAGCAGCAATCTAATCTTTTGAGAG 

TCTTGCAGCTTACCGTCTGGGGCATCAAACAACTTCAGGCACGCGTCCTGGCCGTAGAGCGCTAT^ 

CGGGATCTGGGGGTGTTCTGGAAAATTGATCTGC AC GAC AAATGTGC CTTGGAAC AGC AGCTGGTC AAATAAAAGCC AAGACGAA 
ATATGGGATAAC ATGAC ATGGATGGAATGGGAT AAAG AAATTAAT AATT AC AC TG AC ATTATTTAC TC AC TTATCG AGGAATC AC 
AAAATCAACAGGAAAAAAATGAACAGGAACTCTTGGCTCTGGACAAATGGGCTTCACTGTGGAACTGG 
GCTCTGGTAAAGATCTTACAA 
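
One way to check that a codon-optimized sequence still encodes the intended peptide is to translate it with the standard genetic code and compare the result with the listed amino acid sequence. The sketch below does this for the first line of the CON-S 140CF sequence above. The helper name `translate` is an assumption made for the example, and the code expects a clean A/C/G/T string with no extraction damage.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate complete codons from the start of dna, stopping at a stop codon."""
    dna = dna.upper().replace(" ", "")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i : i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# First line of the CON-S 140CF construct, including the 5' flank.
construct_start = "TTCAGTCGACAGCCACCATGCGGGTCATGGGGATACAGAGGAATTGCCAGCACTTGTGGAG"
orf_start = construct_start[construct_start.index("ATG"):]
print(translate(orf_start))  # MRVMGIQRNCQHLW
```

The translated residues match the start of the CON-S 140CF peptide (MRVMGIQRNCQHLW...). The same check could be applied to a full sequence once it has been recovered without gaps.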

CONSENSUS_A1-2003 (845 a.a.)

MRVMGIQRNCQHLIiRWGTMIliGMI I ICSAAENLW\mrrrGVPWKDAETTIiFCASDAKAYETE^^ 
ENVTEEFNMWKNNM^QMOT^ 

RLDWQINEl^SNSSYRLINCNTSAITQACPKVSFEPIPIHYCAPAGFAILKCKDKEFNGTGPCKWSTVQCTHG 
LNGSLAEEEVIIRSE^ITNNAKTIIVQLTKPVTCINCTRPNNNTRKSIRIGPGQAFY 

KQLRKYFKNKTIIFTNSSGGDLEITTHSFNCGGEFFYCNTSGLFNSTWNNGTMKNTITLPCRIKQIINMW 
IRCESNITGIiLLTRDGGNI^TNETFRPGGKjDMRDNWl^SEIjYKYTCVA/KIEPLGV 

TMGAAS I TLTVOAROLLSG1 VQQOSMLLRAI EAQQHLLKLTVWG I KQLQARVLAVERYLKDQQLLG IWGC SGKL I CTTNVPWNS S 
WSNKSQITOIWDNMTWLQWXJKEISNV^ 

FAVX.SVINR\mQGYSPLSFQTHTPNPRGLDRPGRIEEEGGEQGRDRSIRLVSGFLALAWDDLRSLCL 
LIX3HSSLKGLRLGWEGLKYLWNLLLYWGRELKISAIOT 

*The underlined amino acid sequence is the fusion domain that will be deleted in the
140CF design. The "W" underlined in red is the last amino acid at the C terminus;
all amino acids after this "W" will be deleted in the 140CF design.

Con-A1-2003 140CF.pep (629 a.a.)
Nick name: 001 

MRVMGIQRNCQHLLRWGTMILGMIIICSAAENLWVTVYYGVPVWKDAETTLFCASDAKAYETEMHNVWATHACVPTDPNPQEIHL
ENVTEEFNMWKNNMVEQMHTDIISLWDQSLKPCV^ 

RLDWQINENNSNSSYRLINCOTSAITQACPKVSFEPIPIHYCAPAGFAILKC 

LNGSLAEEEVI IRSENITNNAKTI IVQLTKPVKINCTRPNNNTRK.SIRIGPGQAFYATGDI IGDIRQAHCNVSRSEWNKTLQKVA 
TOLRKYFKNKTIIFTNSSGGDLEITTHSFWCC^EFFYCNTSGL^ 

IRC ESNITGLiL»L»TRDGGNNNTNETFRPGGGDMRDNWRS ELYKYKVVKIE PLGVAPTRAKTIjTVQARQI^SGXV'QQQSNIiIjRAIKA 
QQHLLKLTVWG IKQLQARVLtAVERYIiKDQQIjLfGXWGC S GKLI CTTNVPWNS SWSNKSQNB IWDNMTWLQWDKE I SNYTHI IYNLI 
EE S QNQQKKNE QDT.T.AT.DKWANLWNWFDI SNWLW * 

*Amino acids shown in blue are for easy identification of the junction of the
deleted fusion cleavage site.

CODON-OPTIMIZED Con-A1-2003.seq
Nick name: 001 (1918 nt)

TTC AGTC G AC AGC C ACC ATG AGGGTG ATGGG AATC C AAC GG AAC TGC C AGC ATC TTC TC C GGTGGGG AAC G ATG AT AC TGGG AAT 
GATAATAATCTGCTCTGCCGCTGAAAACCTCTGGGTCACAGTGTACTACGGAGTGCCTGTATGGAAGGACGCTGAAA 
TTTTGTGCTTCCGATGCTAAAGCCTACGAAACCGAGATGCACAATGTTTGGGCCACCCACGCCTGCGTGCC 
CACAAGAAATACATCTGGAGAATGTTACTGAGGAATTTAACATGTGGAAAAATAATATGGTAGAGCAAATC 

TTC ACTCTGGGACCAATC ACTCAAACCCTGCGTTAAACTTACCC CCCTCTGC GTGACCCTCAATTGTAGC AACGTC AACGTC ACA 
AATAATAC AACC AAC ACTC AC GAGGAAGAAATTAAAAATTGCTC CTTT AATATGAC C AC TGAACTTC GC GAC AAAAAAC AAAAAG 
TC TATTC ACTGTTTTATAGGCTGGACGTCGTCC AAATC AACGAGAAC AATTCTAAC AGTAGC TATC GACTTATC AATTGC AATAC 
CTCTGCTATTACC CAAGCTTGTCCTAAAGTCTC TTTTGAACCAATCCCTATCCAC TACTGTGCCCC AGCTGGATTCGC AATTCTG 
AAGTGC AAGGATAAGGAATTC AAC GGAACTGGC CCTTGC AAGAAC GTTAGC ACTGTCC AATGC ACTC ACGGAATC AAACC AGT AG 
TCAGCACTCAACTGCTCCTGAATGGCTCACTCGCCGAAGAAGAGGTGATTATCCGA^ 
AATAATTGTTCAATTGACGAAACCAGTGAAGATCAACTGTACTAGACCAAATAACAACACAA 
GGACAAGCCTTCTACGCAACAGGAGATATCATAGGTGACATCAGACAGGCCCATTGCAACGTTT^ 

C ACTC CAGAAAGTGGCAAAGC AGC TG AGAAAATAC TTTAAGAAC AAG AC AATC ATATTT AC TAAC TC C TC C GGAGGTG ATC TC GA 
AATAACCACTCATAGCTTTAATTGTGGGGGCGAATTCTTCTACTGTAACACATCTGGCCTCTTTAATTCTACCTGGAATAACGGC 
ACCATGAAAAATACTATCACCCTCCCTTGCAGAATTAAGCAAATCATTAACATGTGGCAGAGAGCAGGACAGGCCATC 
CTCCCATTC AAGGTGTGATTCGATGTGAAAGCAAC ATTAC TGGACTTCTTCTGACC C GGGATGGCGGAAATAATAATACC AATGA 
GAC ATTC AGACCC GGC GGC GGC G ATATGCGAGAC AATTGGC GAAGTGAAC TTTATAAATAC AAAGTAGTTAAGATTG AGC C C C TT 
GGAGTTGCCCCTACTAGAGCAAAAACATTGACCGTTCAGGCCAGGCAGCTGCTCTCA 

TCCGAGCTATCGAGGC AC AAC AAC ATCTCTTGAAATTGACCGTATGGGGAATC AAGC AATTGC AGGC TAGGGTTTTGGC TGTGGA 
ACGCTATCTCAAGGATCAGCAGCTTCTGGGAATCTGGGGATGCTCTGGGAAATTGATATGTACTACAA^ 

AGC TGG AGT AAT AAAAGC C AG AAC G AAATTTGGG AT AAT ATG ACC TGGC TGC AGTGGG AC AAAG AAATTTC TAATT AT AC TC ATA 
TC ATATAC AATCTGATC GAAGAATC ACAGAACCAGC AGGAAAAGAATG AGC AAG ACC TTC TGGCCTTGGAC AAGTGGGC TAACTT 
GTGG AACTGGTTTGAC ATT AGC AAC TGGC TGTGGTAAAGATCTTACAA 


CONSENSUS_C-2003 (835 a.a.)

MRWGILRNCQQWWIWGILGFWMLMICNWGNLWVTVYYGV 
ENOTENFNMWKNDMVDQM^ 

VPLNENNSYRLINCOTSAITQACPWSFDPIPXHYCAPAGYAILKCNNKTFNGTGPCNN^ 

EEIIIRSENLT1TOAKTIIVHLNESVEIVCTRPNNNTRKSIRIGP 

PNKTIKFEPSSGGDLEITTHSFNCRGEFFYCNTSKLFNSTYNSTNSTITLPCR 

LLLTRDGGKNNTET FRPGGGDMRDNWRS ELYKYKWE IKPLG 1 APTKA KRRWEREKRAVG I GAVFLGFLGAAG STMGAAS 1 TLT 
VQARQIiLSGIVQQQSNLLRAI EAQQHMLQLTVWGI KQLQTRVLAI ERYLKIX3QLLGIWGCSGKLICTTAVPWNSSWSNKSQEDIW 
DNMTWMQWDREISNYTDTIYRLLEDSQNQQEKNEKDDLAL 

RQGYSPLSFQTLTPNPRGPDRIXSRIEEEGGEQDRDRSIRLVSGFLJU^WDDL^ 

QRGWEALKYLGSLVQYWGIjEIjKKSAISLLDTIAIAVAEGTDRIIELIQRICRAIRNIPRJIIRQGFEA^ 
*The underlined amino acid sequence is the fusion domain that will be deleted in the
140CF design. The "W" underlined in red is the last amino acid at the C terminus;
all amino acids after this "W" will be deleted in the 140CF design.

Con-C-2003 140CF.pep (619 a.a.)
Nick name: 003 

MRVRGILRNCQQWWIWGILGFWMLMICNVVGNLW 

ENVTENFNMWKNDMVDQMHEDI I SLWDQSLKPCVKLTPLC VTLNCTNATNATNTMGEI KNC SFNITTELRDKKQKVYALFYRLDI 
VPLNENNSYRLINCNTSAITQACPKVSFDPIPIHYCAPAGYAIIjKCNbn^TFNGTGPCNNVSTVQCTHGIK 

EEI I IRSENI/TNNAKTI TVHLNESVEIVCTRPNNNTRKS I RIGPGQTFYATGDI IGDI RQAHCNISEDKWNKTLQKVSKKL.KEHF 

PNKTIKFEPSSGGDLEITTHSFNCRGEFFYCNTSKLFNSTYNSTNSTITLPCRIKQIINMWQE^GRAMYAPPIAGNITCKSNITG 

LLLTRDGGKNNTET FRPGGGDMRDNWRSELYKY*^^ 

GIKQI<QTKVIAIKRYIiia>QQIjIjGIWGCSGKI«ICTTAV 

EKDI^LALDSWKNLWNWFDITNWLW* 



*Amino acids shown in blue are for easy identification of the junction of the
deleted fusion cleavage site.

CODON-OPTIMIZED Con-C-2003 140CF (1,888 nt.)
Nick name: 003

TTC AGTC GAC AGC CAC C ATGC GAGTGAG AGGC ATTCTGC GGAATTGTCAGC AATGGTGGATCTGGGGC ATAC TC GGATTC TGGAT 
GC TT ATGAT ATGC AATGTTGTGGGGAAC C TGTGGGTT AC C GTATAC T ATGGGGTTC C AGTC TGGAAGGAGGC T AAAAC AAC GC TG 
TTCTGTGCAAGTGACGCCAAAGCCTACGAGAAAGAAGTGCACAACGTCTGGGCTACCCACGCTTGTC 

C C C AGGAAATCGTC C TC GAGAACGTG AC TG AAAAC TTTAAC ATGTGG AAGAATG ATATGGTAGATC AGATGC AC GAAGATATC AT 
TTCATTGTGGGACCAATCATTGAAACCATGCGTAAAACTGACCCCCCTCTGCGTAACACT^ 

AC C AATAC TATGGGC GAAAT AAAAAAC TGTAGC TTTAAC ATT AC AAC GGAAC TC C GGGATAAGAAAC AAAAGGTCTAC GC GC TC T 
TTTACCGACTCGATATCGTCCCACTTAACGAGAATAATAGTTACCGCCTGATTAACTGTAACACATCAGCCATTACGCAAGCTT^ 
CCCCAAAGTTTCTTTCGACCCCATCCCAATTCACTATTGTGCCCCCGCTGGATACGCTATACTTA 

AATGGAACCGGACCATGTAACAACGTCAGTACCGTACAATGTACGCACGGAATTAAACCTGTTGTCTCAACCCAGCTTCTC 
ACGGC TC ATTGGC GGAGG AAGAAATTATTATC AG ATC AGAAAAC TTGAC C AAC AATGC C AAAAC C ATC ATCGTGCACCTC AATG A 
ATCCGTGGAAATCGTGTGCACCAGACCAAATAACAATACCCGGAAATCAATCAGGATTGGGCCTGG 

GGTGATATAATTGGC G ATATTAGAC AAGC CC ATTGC AAC ATATC AGAAG AC AAGTGGAAT AAGAC TCTGC AGAAGGTTTC TAAGA 
AGCTGAAGGAACACTTTCCCAATAAAACGATTAAGTTCGAGCCCTCTTCAGGAGGAGACCTTGAGATCACAACAC 
TTGTAGAGGGGAGTTCTTCTATTGTAATACATCAAAGCTCTTTAACAGTACCTACAACTCCACTAATAGTACC 
TGCAGAATAAAGCAAATAATCAACATGTGGCAAGAAGTTGGCCGAGCAATGTACGCCCCTCCCATCGCAGGC 

AATC C AATATT AC TGGCC TTTTGCTGAC ACGGGAC GGC GGAAAGAAT AAC AC TGAG AC C TTC AGAC CTGGC GGAGGC GAT ATGC G 
CGATAATTGGCGGAGCGAGCTCTACAAGTATAAAGTCGTTGAAATCAAGCCACTGGGCATAGCTCCTACGAAAGCAA 
ACTGTTC AGGC TAGAC AGC TGC TC TC C GGC AT AGTGC AAC AGC AATC C AATC TCCTGCG AGC T ATC GAAGCCC AAC AACAT ATGC 
TC C AGC TTAC CGTCTGGGGAATC AAAC AATTGC AAAC AC GAGTGC TGGC G ATAG AGAGATATTTGAAAG ATC AGC AAC TC C TGGG 
GATTTGGGGCTGTTCAGGTAAGCTCATCTGTACAACTGCGGTGCCGTGGAACTCAAGCTGGAG 

TGGGACAACATGACTTGGATGCAGTGGGATCGAGAAATAAGCAACTATACAGATACCATTTATCGGCTCCTGGAGGACTCACAGA 
AC C AGC AGG AG AAAAATGAG AAAG ATTTGC TC GC GC TTGAC AGTTG GAAGAATTTGTGGAATTGGTTC GAC ATT AC AAAC TGGC T 
CTGGTAAAGATCTTACAA

CONSENSUS_G-2003 (842 a.a.)

MRVKGIQRNWQHLWKWGTLILGLVIICSASNNLWVTVYYGVPVWEDADTTLFCASDAKAYSTERHNVWATHACVPTDPNPQEITL
ENVTENFNMWKNNMVEQMHEDI I SLWDESLKPCVlCIjTPIXr\rrLNCTDVlJVTNNNTNNTKKE I KNC S FN ITT EI RDKKKKEYALFY 
RLDVVPINDNGNSSIYRLINCNVSTIKQACPKVTFDPIPIHYCAPAGFAILKCRDKKFNGTGPCKN^ 
LNGSIiAEEEIIIRSENITDNTKVIIVQLNETIEINCTRPNNNT^ 

AQLKKIFNKSITFNSSSGGDLEITTHSFNCRGEFFYCNTSGLFNNSLLNSTNSTITLPCKIK^ 

CRSNITGLLLTRDGGNNNTETFRPCXSGDMRDNWRSELYKY 

AASITLTVQvTRQLLSGIVQQQSNIjIjRAIEAQQHIjIjQIjTVWGIKQLiQARv^ 

KSYNEIVTONMTWIEl^REISNYTQQIYSLIEESQNQQEKNEQDLLAIjDKWASL^ 

LSIVNRVTIQGYSPLSFQTLTHHQREPDRPERIEEGGGEQDKDRSIRIiVSGFIjAIjAWDDL^ 

RSSLKGLRLGWEGLKYLWNIXLYWGQELKNSAIN^ 

*The underlined amino acid sequence is the fusion domain that will be deleted in the
140CF design. The "W" underlined in red is the last amino acid at the C terminus;
all amino acids after this "W" will be deleted in the 140CF design.

Con-G-2003 140CF (626 a.a.)
Nick name: 007

MRVKGIQRNWQHLWKWGTLILGLVIICSASNNLWVTVYYGVPVWEDADTTLFCASDAKAYSTERHNVWATHACVPTDPNPQEITL
ENVTENFNMWKNNMVEQMHEDI I SLWDESLKPCVKLTPLCVTLNCTDVNVTNNNTNNTKKE 
RLDWPINDNGNSSIYRLINCNVSTIKQACPKVTFDPIPIHYCAPAGFAILK^ 

LNG S LAEEE III RS ENI TDNTKVI I VQLNET I E I NC TRPNNNTRKS I R I G PGQAF Y ATGD 1 1 GD I RQ AHCNV S RTKWNEMLQ KVK 

AQLKKIFTSnCSITFNSSSGGDLEITTHSFNCRGEFFYCNTSGLFNNSLI^STNSTITLPCKIKQIVRMWQRVGQAMYAPPIAGNIT 

CRSNITGLLLTRDGGNNOTETFRPGGGDMRDNWRSELYKYKIVK^ 

LLQI/TVWGIKQLQARVI^VERirL^^ 

QNQQEKNEQDLLALDKWASI-WNWFDITKWLW* 

*Amino acids shown in blue are for easy identification of the junction of the
deleted fusion cleavage site.

CODON-OPTIMIZED Con-G-2003 140CF.seq
Nick name: 007 

TTC AGTC G AC AGC CACC ATGC GAGTGAAGGG AATC C AG AGAAATTGGC AGC AC C TTTGGAAGTGG GGC AC AC TC 
TGTGATCATATGCTCTGCCTCAAATAACCTTTGGGTCACAGTTTATTACGGCGTGCC 
TTTTGTGCCAGCGACGCTAAGGCTTATTCAACAGAGAGGCATAACGTTTGGGCTACACAT^ 
CCCAGGAAATCACTCTTGAGAATGTTACAGAGAATTTTAATATGTGGAAG 

TTCTCTCTGGGATGAATCTCTGAAACCTTGCGTGAAGCTTACACCACTGTGCGTTACCCTGAATTGCAC 
AATAATAATACCAACAATACAAAAAAAGAAATCAAAAATTGTTCTTTCAACATAACCACCGAGATACGCGATAAAAAAAAGAAAG
AATAC GC C CTGTTC TACAGAC TC GATGTGGTCC C AATTAATGAC AAC GGAAATTCTTC C ATCTAC C GAC TTATC AATTGTAACGT 
GTCTACAATCAAACAGGCCTGTCCT7VAAGTCACATTTGACCCTATTCCCATTCATTACTGTGCCCCC 

AAATGC CGAGAC AAAAAATTTAAC GGAAC AGGAC C ATGC AAGAATGTC TC AAC AGTTC AATGC ACTCATGGAATTAAAC C AGTCG 
TTTCTACTCAACTCCTTCTCAATGGAAGCCTGGCAGAAGAGGAAATCATAATCCGCAGCGAAAACAT 

AATC ATCGTAC AGCTGAAC GAGACC ATTGAAAT AAATTGTAC GAG AC C TAATAATAAC AC AAGAAAAAGC ATAC GC ATC GGCCCC 
GG AC AGGC TTTC T AC GC C AC AGGAG AC ATTATCGGAGATATC C GC C AGGC TC AC TGTAATGTGTC TAGAAC AAAATGG AACGAAA 
TGCTTC AGAAGGTCAAAGC TC AGCTC AAGAAAATATTC AAC AAATCTATTAC ATTC AACTC ATC ATC AGGC GGCGATCTGGAGAT 
AACAACTCATTCCTTCAACTGTCGGGGAGAATTTTTTTACTGTAACACGTCCGGCC 
AACTCCACCATCACTCTCCCATGTAAGATCAAACAAATCGTCAGAATGTGGCAGCGAGTCGGT 

TCGCCGGTAATATCACATGTAGAAGCAATATCACAGGGCTCTTGCTTACAAGGGACGGCGGGAACAACAACACCGAAACCTTCAG 
*AC C AGGAGGAGGAGAC ATGC G AGAC AATTGGCGG AGC G AGCTGTATAAAT ATAAGATCGTAAAAATC AAAC C ATTGGGTGTAGCG 
CCAACTAGAGCCCGAACACTGACCGTGCAGGTGAGGCAACTGCTGAGCGGCATTGTCCAACAACAATCCAATC 
TCGAGGCCCAGCAGCATCTGCTCCAGCTTACTGTATGGGGAATCAAACAACTGCAAGCAAGAGTATTGGCAGTGGAGAGGTATC 
CAAGGACCAGCAGCTTCTGGGAATTTGGGGTTGCAGCGGAAAGCTCATATGTACAACCAATGTGCCCTGGAACACTAGTTGGAGT 
AATAAGAGTTACAATGAAATCTGGGAC AATATG AC ATGGATC GAATGGGAGCGCGAAATATC C AACTATACTC AGC AAATCTATT 
CCCTCATTGAAGAGAGTCAGAACCAGCAGGAAAAGAATGAGCAAGACCTCCTCGCCCTGGATAAATGGGCATC 
GTTTGACATAACTAAATGGTTGTGGTAAAGATCTTACAA

CONSENSUS_01_AE-2003 (854 a.a.)

MRVKETQMIWPNLWKWGTLIIjGIiVIICSASDNL^ 
ENVTENFNMWKNNMVEQMQEDVISLWIX3SLKPCVKL,T 

QKVHALFYTCLDIVQIEDNNSYTUjINCNTSVIKQACPKISFDPIPIHYCTPAGYAI 

STQLLLNGSLAEEEIIIRSENLTNNAKTIIVHIJ^SVEI^ 

LKQVTEKLKEHF*nTCTIIFQPPSGGDIjEITMHHFN^ 

MYAPPISGRINCVSNITGILLTRIX^AKnymJETFRPGGGNIKDNWRSEL 

FGFLGAAGSTMGAASI TLTVOAROLLSGIVQOQSNLLRAI EAQQHLLQLTVWGI KQLQARVLAVERYIjKDQKFLGLWGCSGKI IC 
TTAVPV^STWSimSFEEIWNNMTWI 

GGLIGLRIIFAVLSTVNRVRQGYSPLSFQTPTHHQREPDRPERIEEGGGEQGRDRSVR^ 

ILIAARTVELLGHS SliKGLRRGWEGLKYLGNLLL YWGQELKI SAI SLLDATAIAVAGWTDRVI EVAQGAWRAILH I PRRIRQGLE 
RALL 

*The underlined amino acid sequence is the fusion domain that will be deleted in the
140CF design. The "W" underlined in red is the last amino acid at the C terminus;
all amino acids after this "W" will be deleted in the 140CF design.

Con-AE01-2003 140CF.pep (638 a.a.)
Nick name: 008 

MRVKETQMNWPNLWKWGTLILGLVIICSASDNLWVTVYYGVPVWRDADTTLFCASDAKAHETEVHNVWATHACVPTDPNPQEIHL

ENVTEaSFNMWKNNMVEQMQEDVISLWDQSL^ 

QKVHALFYKLDIVQIEDNNSYRLINCNTSVIKQACPKISFDP 

STQLLLNGSLAEEEI IIRSENLTNNAKTI IVHLNKSVEINCTRPSNNTRTSITIGPGQVFYRTGDIIGDIRKAYCEINGTKWNEV 
LKQVTEKLKEHFNNKTIIFQPPSGGDLEITMHHFNCRGEFFYCNTTKLFNNTCIGNETMEG 

^^YAP PI SGRINCV/SNI TG ILLTRDGGANNTNETFRPGGGN I KDNWRS EI/YKYKWQI EPLG I APT RAKTLTVQARQLLiSGIVQQQ 
SNLZiRAX&AQQHI<LQIjTVWGXKQZ4QARVI«AV£RYIjKZ>QKF£iGIjWGCSGK I ICTTAVPWNSTWSNRSFEE IWNNMTWI EWERS I SN 
YTNQIYE I LTESQNQQDRNEKDLLEXlDKWAS LWNWFDITNWLW * 

*Amino acids shown in blue are for easy identification of the junction of the
deleted fusion cleavage site.

CODON-OPTIMIZED Con-AE01-2003 140CF.seq (1945 nt.)
Nick name: 008 

ttcagtcgacagccaccATGCGAGTCAAGGAAACACAAATGAACTGGCCTAATCTGTGGAAGTGGGGCACCCTGAT 

(K3TCATTATTTGCTCTGCGAGC<^CAATCTCTGGGTTACTGTCTATTACGGAGTCC 

TTCTGCGCCTCAGATGCCAAAGCTCATGAAACTGAAGTGCATAATGTTTGGGC^ 

C C C AAGAAATAC AC CTGGAAAAC GTGAC C GAGAACTTTAATATGTGG AAG AATAAC ATGGTTG AAC AGATGC AAG AAGACGTAAT 
CAGCCTGTGGGATCAAAGTCTGAAACCTTGCGTAAAACTGACTCCACTTTGCGTAACACTT/^ 
AACGTTAACAACATCACTAACGTCTCCAACATCATCGGCAACATAACGAACGAAGTGAGAAATTGCAGTTTCAA 
AGCTCCGGGACAAGAAACAGAAGGTCCATGCTCTCTTTTACAAACTCGACATCGTCCAGATC 

TATAAATTGTAATAC ATC C GTGATTAAAC AAGC ATGCCCC AAAATAAGCTTC GATCC TATTC C TATCC AC TACTGTAC TC C TGC C 

GGCTATGCTATCTTGAAATGCAATGATAAGAACTTCAATGGGACCGGACCTTGTAAGAACGTGTCT 

GCATTAAACCAGTGGTAAGCACCCAGCTGCTCCTGAACGGCTCTCTGGCAGAGGAAGAGATTATTATTCGAA 

C AAC AACGCTAAGACTATC ATC GTAC ATC TCAATAAATCAGTCGAAATTAATTGC ACC AGAC C C TC CAAT AATAC TAGAAC TTC A 
ATCACTATCGGCCCAGGACAAGTCTTTTATAGAACAGGAGATATCATAGGAGATATCAGA 

caaaatggaacgaagtactcaaacaagtcacagagaagcttaaggaacatttca^ 

TGGCGGAGACCTCGAAATCACTATGCACCACTTCAACTGCCGCGGCGAATTTTTTTATTG 



ACGTGCATCGGAAATGAGACCATGGAGGGCTGCAATGGAACAATCATACTCCCATGCAAGATAAAACA 

AAGGTGCTGGACAAGCTATGTATGCACCCCCAATATCCGGTAGAATTAATTGCGTCAGCAACATCACTGGCATACTGCTCACTAG 

AGACGGAGGAGCAAATAATACAAA.TGAAACATTCCGACCAGGCGGCGGCAACATTAAGGACAACTGGCGGTCCGAACTCTATAAG 

TACAAAGTCGTACAGATCGAACCTCTTGGAATAGCACCGACTCGCGCTAAGACACTCACAGTACAGGCCCGACAACTTCTTTCTG 

GAATCGTACAGCAGCAATCCAACCTCCTCCGCGCAATCGAGGCCCAACAACATCTGCTTCAGCTCACAGTTT 

GCTCCAGGCACGCGTGCTCGCAGTGGAAAGATACCTGAAGGATCAGAAATTCCTTGGTC 

TGCACTACCGCGGTTCCCTGGAATTCAACATGGAGCAACCGGAGTTTTGAAGAGATATGGAACAATATGACATGGATAGAGTGGG 
AAAGGG AAATTAGTAAC TAT AC GAAC C AGAT ATACGAAATC C TC ACCGAAAGCC AAAATC AGC AG G ATCGC AAC GAAAAAGAC C T 
CCTCGAGCTTGATAAGTGGGCATCCCTTTGGAACTGGTTCGACATCACAAATTGGCTCTGGtaaagatcttacaa 



Wild-type subtype A Env
00KE_MSA4076-A (Subtype A, 891 a.a.)

MGAMG I QMNWQNLWRWGTM I LGML I ICSVAEKSWVTVYYGVPVWRDAETTLFCASDAKAHDKEVHNVWATHACVPTDPN 

E^n^^EDFNMWKNSWEQMHTDIISLWDQSLKPCVKLTPLCWIiNCSDSNITS^ISTSN^ 

KQKVYSLFYRLDWQINENSSDYRLINCOT 

VVTTQLLLNGSLAEEEVMIRSENITENAKNIIVQFKEPVQIICIRPGNNTRKSVHIGPGQAFYATGDIIGDIRQAH 

KTLQEVATQLRKHFRNOTKIIFTNSSGGDVEITTHSFNCGGEFFYCDTSGLFNSSWTASNDSM 

WORAGOAMYAPPIPGIIRCESNITGLII/TRDGGEGI^STNE^ 

RAVGIjGAVFIGFLGAAGSTMGAASMT LiTVOAROLiLSG I VOOOSNLIiRAI EAOOHIjIjKLiTVWGI KQLOARVIAVERYLiRDQQIjLGI 
WGC SGKL ICTTWPVmS SWSmSLDEIWENMTWMQWDKEVSOTTQ 

YIKIFIMIVGGLIGIiRIVPAVLSVINRVRQGYSPLSFQTHTPNPRGLDRPGRIEEEGGEQDRDRSIRIiVSGFIAIAWD 

FS YHRLRDFI LI AARTLELIXSHNSLKGLRIiGWEGLKYLWNLLAYWGRELKI SAI SLVDSIAI AVAGWTDRI I EIVQAIGRAILHI 

PRRIRQGLERALI 

*The underlined amino acid sequence is the fusion domain that will be deleted in the
140CF design. The "W" underlined in red is the last amino acid at the C terminus;
all amino acids after this "W" will be deleted in the 140CF design.

00KE_MSA4076-A 140CF.pep (647 a.a.)



MGAMGIQMNWQNLWRWGTMILGMLIICSVAEKSWVTVYYGVPVWRDAETTLFCASDAKAHDKEVHNVWATHACVPTDPNPQEMIL
ENV^EDFNMWKNSM\^QMHTDIISLVJDQSIjKPCVKIjTPI^ 

KQKVYSLFYRLDWQINENSSDYRLINCNTSAITQACPKVTFEPIPIHYCAPAGFAILKCNDKKFNGTGPCTW 
WTTQIiLLNGSIiAEEEVMIRSENITENAKNI I VQFKE PVQI I C I RPGNNTRKSVH I GPGQAFYATGDI IGDI RQAHCNVSRELWN 
KTLQEVATQLRKHFRWTKIIFTNSSGGDVEITTHSFN^ 

WQRAGQ AMYAP P I PGIIRCE SNITGLI LTRDGG EGNN S TNET F RPVGGNMRDNWR S ELYKYKWKVE PLGVAPTKSRTT/TVQARQ 
LLSGIVQQQSNLLRAIEAQQHIAJtl/rW 

MQWDKEVSNYTQMIYlgI*I<EESQNQOEKireOEI«IAI«DKWANLWNWFNISNV^W* 

*Amino acids shown in blue are for easy identification of the junction of the
deleted fusion cleavage site.

CODON-OPTIMIZED 00KE_MSA4076-A 140CF.seq (1972 nt.)
Nick name: 011

ttcagtcgacagccaccATGGGGGCAATGGGAATCCAGATGAACTGGCAGAACCTCTGGCGATGGGGCACAATGATCCTGGGTAT

GCTCATCATCTGCTCTGTTGCAGAAAAGTCATGGGTAACAGTCTACTACGGCGTACCAGTGTGGCGGGACGCCGAA 

TTCTGCGCCTCCGATGCCAAAGCACACGATAAAGAAGTCCACAATGTTTGGGCTACCCATGCGTGCGT 

CACAAGAAATGATACTCGAAAACGTTACTGAAGACTTCAACATGTGGAAAAATTCTATGGTTGAACAGATGC 

ATCACTGTGGGATCAGTCTCTCAAACCCTGTGTCAAATTGACCCCCCTCTGCGTTACACTGAACTC 

TCTAATTCAACGAGCAATAGTACGAAAGACTCCGCAACCCTTGATATGAAAAGCGAAATACAGAACTGTTC 

C CG AACTG AGAG ATAAAAAGC AGAAGGTTTATTC TCTGTTC TATCGATTGG AC G TGGTTC AGATTAACGAAAATAGC AGC GATTA 
CCGACTCATTAACTGCAATACATCAGC/VATCACACAGGCTTGCCCAAAGGTAACATTTGAGCCAAT^CCTATTCACTACTGCGCC 
CCTGCAGGATTTGCCATCCTGAAATGCAACGATAAGAAGTTTAATG 

CCCACGGCATAAAACCTGTTGTTACCACACAATTGCTGCTCAATGGATCACTTGCTGAAG 

C ATC ACTGAAAATGCC AAAAATATTATAGTTC AGTTC AAAGAACCC GTC C AG ATC ATTTGC ATTCGCC CTGGTAACAACACTCGC 

AAGTCAGTGCACATTGGGCCCGGCCAGGCTTTCTATGCAACCGGAGATATTATAGGCGACATCAGACAGGCACATTGCAACG 

GCCGGGAATTGTGGAACAAAACTTTGCAGGAAGTTGCTACTCAGCTGCGAAAACATT^ 

T AATTC ATC AGGC GGTGAC GTGG AG ATC AC T AC CC ATTC ATTTAAC TGTGGC GG AG AATTC TTC TATTGC G ATAC C TC TGGGC TC 

TTTAATTCCTCATGGACTGCTAGCAACGATTCAATGCAAGAAGCACATTCCACAGAAAGTAATAT 

AACAAATCATCAATATGTGGCAGCGGGCCGGTCAAGCAATGTACGCACCTCCCATCCCCGGAATTATTCG 

CACTGGCCTCATTCTGACCCGAGACGGTGGCGAAGGTAATAATTCTACAAACGAGACTTTCA 

GACAATTGGCGATCCGAACTGTATAAATATAAAGTGGTGAAGGTAGAACCTCTT^ 

CTGTGCAGGCACGCCAACTTCTGAGCGGAATAGTCCAACAGCAATCCAATCTTCTC 

TAAACTTACGGTGTGGGGAATCAAACAATTGCAGGCAAGAGTGCTGGCAGTGGAACGATACTTGAG 

ATCTGGGGATGTTCCGGTAAGTTGATTTGCACGACAAACGTTCCCTGGAACTCTTC^ 

GGGAAAATATGACATGGATGCAGTGGGACAAGGAAGTTAGCAACTATACACAGATGATCTACAACCTCCTC 





TCAACAGGAAAAAAACGAACAAGAACTGCTC
TGGtaaagatcttacaa 

Wild-type subtype B

QH0515.1g gp160 (861 a.a.)

MRVKEIRRNCQRLRRWGTMLLGMLMICSATEQLWVTVYYGVPVWKEATTTL
ENVTENFNMWKNNMVEQMHEDIISLWEQSLKPCVKLTPLCVTLNCTDKL

EYSLFYKX.DVIPIDSRNNSNNSTEFSSYRLISCOT 

HGIKPVVSTQLLJjNGSI*A£EEVVIRSENFTNNVKSIIVQL^ 

AQWNlSrTLKQIVIKLREQFGNKTIVFNQSSGGDVEIVTCHSFNCGGEFFYCNSTQL 
TVNIWQKVGKAMYAPPIRGQIRCSSKITGLILTRDGG™^ 

QREKRAVGT IGAMFLiGFLiGAAGSTMGAASIjT IjTVOARLtljLi SGIVQOONNLiIjRAI EAOOHLIjOIjTVWGI KQLOARVLtAVERYLtRDO 
. QLJjGIWGCSGRLICTTWPWOTSWS^SLNYIWDNMTWMQ™ 
TNWLWYIKIFIMIVGGLIGIjRIVFAVLiS IVNRVRQGYSPLSLQTHLPARRGPDRPEGIGEEGGERDRDRSVRLVHGFIxAIjVWEDIj 

rsixlfsyhrlrdlllivartveilgqrgwralkywwnlllywslel e IARRI FRAFLH I PT 

RIRQGLERALL 

*The underlined amino acid sequence is the fusion domain that will be deleted in the
140CF design. The "W" underlined in red is the last amino acid at the C terminus;
all amino acids after this "W" will be deleted in the 140CF design.

QH0515.1g 140CF (651 a.a.)
Nick name: 012



MRVKEIRRNCQRLRRWGTMLLGMLMICSATEQLWVTVYYGVPVWKEATTTL
ENVTENFNMWKISntfMV^Q 

EYSLFYKLDVIPIDSRNNSNNSTEFSSYRLISCNTSVITQACPKISFEPIPIHYCA 
HGIKPWSTQLLIJtfGSLAEEEWIRSENFTNN^ 
AQWNNTLKQIVIKLREQFGNKTIVFNQSSGGDVEIVMHSFN^ 
IVNMWQKVGKAMYAPPIRGQIRCSSKITGLIIiTRDGGTNGTbn^ 

QARUJLSG I VQ QQNNLLRAIEAQQHLLQLTVVJG IKQLQARVXAVERVLRDQQIjLfGIWGC SGRL ICTTNVPWNT SWS NRS LNY XWB 
NMTWMQWDRE I NNYTDYIYTLLEDAQNQQEKHEQF,T«TiKTtnKWAS LWNWFD XTNWLW * 

*Amino acids shown in blue are for easy identification of the junction of the
deleted fusion cleavage site.

CODON-OPTIMIZED QH0515.1g 140CF.seq (1984 nt.)
Nick name: 012 

ttcagtcgacagccaccATGAGAGTAAAAGAAATCAGACGCAACTGTCAGAGGTTGAGGAGATGGGGAACGATGCTCCTGGGCAT
GCTGATGATTTGCAGTGCCACCGAACAGCTTTGGGTAACCGTGTACTATGGTGTAC^ 

ttttgcgcgtccgacgcaaaagcctacgtaacagaaaagcacaacgtgtg<^ 

CTCAGGAAGTCGTTCTGGAAAATGTAACAGAAAATTTTAATATGTGGAAAAACAATATGGTAGA 
CTCACTGTGGGAACAATCCTTGAAACCTTGTGTCAAACTGACCCCACTTTC^ 

GATAC GTC CGGAAC AAATTC AAGC AGC TGGGAAAAAGTGC AAAAGGGC G AAATC AAAAATTGTTC ATTTAAC ATC AC TAC C GGTA 
TCAGAGGGCGGGTACAGGAATATTCTCTTTTCTACAAACTCGAC 

AGAATTTAGTAGTTATCGCCTTATAAGCTGCAACACCAGCGTGATTACACAAGCGTGCCCTAAAATCTCTT^ 
ATTCACTACTGCGCACCAGCCGGCTTCGCCATCCTCAAATGTAACGACAAGAAATTTAACGGAACCGGACC 

CC AC CGTTCAATGC ACTC ATGGAATC AAGCCCGTCGTTTCTACCCAACTTCTTC TC AATGGTAGCCTTGCGGAGGAGGAAGTTGT 
GATTCGCTCCGAAAATTTTACAAACAACGTCAAGTCAATCATCGTCCAGCTTAATA 

AACAATAACACCAGAAAATCCATTCACATAGGGGCCGGGAAAGCTCTGTATACCGGGGAAATTATTGGAGACATCAGACAAGCAC 

ACTGTAACTTGAGTCGCGCCCAGTGGAACAACACATTGAAACAGATCGTGATCAAGCTCAGAGAGCAGTTC 

CGTGTTTAATCAGAGCTCCGGCGGTGATGTCGAAATCGTAATGCACTCTTTTAATTGTGGGG 

ACACAATTGTTTAACAGCACCTGGAACGGCAATGACACATGGAATGACACCTGGAAAGATACGACAAATGATAATATO 

CGTGCAGAATAAAGCAAATCGTAAATATGTGGCAAAAAGTGGGCAAGGCCATGTACGCACCACCTATAAGAGGACAA 

TTCTTCCAAGATCACAGGTCTGATACTCACACGGGACGGAGGCACGAACGGGACAAACGAGACCGAGACCTTCCGACCAGGAGGC 

GGCAACATGAAGGATAACTGGAGAAGTGAACTTTACAAGTATAAAGTGGTCAAGATTGAGCCTCTGGGTATCG 

CTAAAACACTCACCGTGCAGGCTAGATTGCTGCTTTCAGGGATAGTCCAACAACAGAACAACCTTCTTAGAGCCA 

ACAACACTTGCTGCAGTTGACAGTGTGGGGAATTAAACAGTTGCAGGCCCGG 

CAGCTTTTGGGTATCTGGGGGTGTTCAGGCCGCCTCATATGCACCACAAATGTCCCTTG 

TT AATTAT ATTTGGG AC AATATG AC ATGGATGC AATGGG ATAGAGAAATT AATAAC TAC AC CGAC TAC ATCTAC AC ACTTC TGG A 

GGACGCCCAGAATCAGCAGGAGAAGAACGAGCAGGAACTCCTCGAATTGGATAAGTGGGCATCACTGTGGAATTGGTTCGATATA 
ACTAATTGGCTTTGGtaaagatcttacaa



Wild-type subtype C
DU123.6 gp160 (854 a.a.)

n MRVKGIQRNWPQWWIWGILGFWMIIICRWGNLWV^ 
GNVTENFNMWKNDMVTDQMH^ 

KQKVYALFYRPDVVPLNENSSSYILINCNTSTTTQAC PKVSFDPI PIHYCAPAGYAILKCNNKTFNGTGPCHNVSTVQCTHGIKP 
VVSTQLLLNGSLAEEEIIIRSENLTNNAKTIIVHLNESIEIVCTRPNNNTRKSIRIGPGQTVYATNDIIGDIRQAHCNISKTKWN

TTLEK\nCEKLKEHFPSKAITFQPHSGGDLEVTTHSFNCRGEFFYCDTC 

APPVEGNITCNSSITGLLLVRDGGIOTSNSTPEIFRPGGGNMKDNV^ 

FGFIXSAAGSTMGAASIT LTVOAROLLSGrVOOOSN^ 

PTWPWNSSWSI^SQTDIWDNMTV^QWDREISNYTGTIYKLLEESQ 

GGLIGLRIIFGVLSrWCRVTlQGYSPLSFOTL^^ 

ILVAARAVEIJ^RSSLRGLQRGWEALKYIjGNLiVQYGGIjELiKRRAI SLFDTIAIAVAEGTDRILEVIIjRI irairni ptrirqgfe 
AALL 

DU123.6 140CF (638 a.a.)
Nick name: 013

mrvkgiqrnwpqwwiwgilgfwmiiicrwgnl^ 

gnvtenfnmwkndmvdqmhediisivtoqslkpcvkltplcvtlnctdvkx^ 

kqkv^alfyrpdwpi^ensssyilincntstttqacpkvsfdpipih^ 

wstqlllngslaeeei 1 1 rsenltnnakti i vhlnes i eivctrpnnntrks irigpgqtvyatndi igdirqahcni sktkwn 

TTLEKVKEKIjKEHFPSKAITFQPHSGGDLEVTTHSFNCRGEFFYCDTTKLFNESNLNTTNT^ 

AP PVEGNITCNS S I TGLLLVRDGGNTSNST PE I FRPGGGNMKDNWRS ELYKYKWEI KPLrGVAPTKAKTItTVQARQliLSGIVQQQ 
SNLLItAJKAQQHMI^LTVWGraQI^ 

YTGTZYKZXEESQNQQEKNEKDUJkliDSVTKKZiWSWFDITMWLW* 

*Amino acids shown in blue are for easy identification of the junction of the
deleted fusion cleavage site.



CODON-OPTIMIZED DU123.6 140CF.seq (1945 nt.)
Nick name: 013 

ttcagtcgacagccaccATGCGCGTAAAGGGGATTCAAAGAAATTGGCCGCAA 

GATAATTATATGCCGCGTTGTCGGAAATTTGTGGGTGACTGTGTACTACGGGGTGCCCGTGTGGACTGAGGCAAA 

TTCTGTGCTAGCGATGCCAAAGCCTATGAACGCGAAGTGCACAATGTTTGGGCTACTCATGCCTGTGTCCCTACC 

C TC AGG AAAT AGTGC TC GG C AATGT AAC GGAAAAC TTC AAC ATGTGG AAAAATG AT ATGGTGG ATC AG ATGC AC G AAGAC ATT AT 

CTCAATCTGGGACCAAAGCCTGAAACCCTGCGTTAAACTGACTCCTCTCTC 

GCCACCTCAAACGGTACGACAACTTACAACAATTCTATTGACTCTATGAACGGCGAAATCAAAAATTGTTC 

CCGAGATACGCGACAAAAAGCAGAAGGTCTATGCCCTTTTTTACCGCCCCGACGTAGTCCCACTCAACGAGA^ 

CATCCTCATCAACTGCAATACATCAACTACCACACAAGCATGCCCGAAAGTTAGCTTTGATCCAATTCCTATACATTACTGCGCC 

CCCGCCGGCTACGCTATACTGAAATGCAATAATAAGACTTTTAACGGGACCGGCCCATGTCACAACGTGTC 

CTCATGGCATCAAGCCCGTGGTGTCAACCCAGCTGCTGCTCAATGGCTCACTTGCAGAAGAAGAAATT 

TCTTACTAACAATGCAAAAACGATTATCGTGCACCTTAATGAATCAATAGAAATC 

AAAAGCATTCGCATCGGACCTGGCCAGACAGTTTACGCAACTAATGACATCATC 

CTAAAACCAAGTGGAATACAACCCTGGAAAAAGTAAAGGAAAAACTTAAAGAACATTTTCCCTCTAA 
TCACAGTGGCGGAGACTTGGAAGTCACAACACATTCTTTTAACTGCCGCGGAGAATTTT^ 

AATGAATCAAATCTCAACACCACAAATACAACCACACTGACCCTCCCCTGTAGAATCAAACAAATCGTAAACATGTGGCAAGGGG 

TTGGAAGGGCTATGTACGCTCCCCCCGTCGAAGGAAATATAACGTGTAACAGCAGCATCACTGGGCTGCTT^ 

AGGCAATACTTCTAATTCAACTCCTGAAATTTTTAGGCCTGGC 

TACAAAGTTGTTGAAATTAAGCCCCTGGGAGTCGCTCCAACCAAAGCTAAAACACTCACAGTGCAAGCAAGACAGCTC 
GCATCGTCCAGCAACAGTC AAATC TCCTTAGAGC AATCGAAGCC C AACAGC ATATGCTCC AACTC AC AGTCTGGGGGATTAAAC A 
GCTTCAAGCCCGCGTGCTTGCTATCGAACGCTATCTTAAAGACCAACAGCTTCTTGGCCTCTGGGGTT^ 
TGCCCCACCACCGTGCCTTGGAATAGTTCTTGGAGTAATAAATCACAGACCGATATTTGGGACAACATC 
ATAGGGAAATTTCTAATTATACTGGCACAATCTACAAACTCTTGGAAGAAAGTCAAAATCAGCAAG 
CCTCGCCCTGGACTCCTGGAAGAATCTTTGGAGCTGGTTCGACATAACTAATTGGCTGTGGtaaagatcttacaa

Wild-type subtype CRF01_AE

97CNGX2F-AE (854 a.a.)

MRVKETQMhWPNLWKWGTLIIXSLVI^ 

EN\7TENFNMVTONNMVEQMQEI>VI S LWDQ SL.K PC VKLTPLC VTLNC TNANWTNSNNTTNG PNK I GN I TDEVKNCTFNMTTELKDKK 
QKVHALFYKLDIVQINS SEYRLINCNTSVI KQAC PKI SFDP I P I HYCTPAGYAI LKCNDKNFNGTGPCKNVS SVQCTHGI KPWS 
TQLLLNGSLAEEEI I IRSENLTNNAKTI I VHLNKSVEINCTRPSNNTRTS ITMGPGQVFYRTGDIIGDIRKAYCEINGIKWNEVL 
VQVTGKLKSHFNKTIIFQPPSGGDLEIITHHFSCRGEFFYCNTTKLFNNTCIGNTSMEGCNN^ 
APPISGRINCVSNITGILLTRIX^ADNNTTNETFRPGGGNIK^^ 

FGFLGAAG STMGAAS ITLiTVQARQLL SG I VQQQSNLLRAI EAQQHLLQLTVWG I KQLQARVLAVERYLKDQKFLGLWGC SGKI IC 

TTAVPWNSSWSNKSFEEIWDNMTWIEVreREISNYTSQlYEILTESQNM^ 

GSLIGLRIIFAVLSrVNRVRQGYSPLSFQTPTHHQREPDRPEEIGEGGGEQSK^ 

I L I AARTVELLGH S SLKGLiRRGWEGLKYLGNLLL YWGQE I KI S A I S LLNATAI AVAGWTDR VI EVAQRAWRAIjIjH I PRRI RQGL.E 
RALL 

*The underlined amino acid sequence is the fusion domain that will be deleted in the
140CF design. The "W" underlined in red is the last amino acid at the C terminus;
all amino acids after this "W" will be deleted in the 140CF design.

97CNGX2F-AE 140CF.pep (629 a.a.)
Nick name: 018 

MRVKETQMNWPNLWKWGTLILGLVIICSASDNLW 
ENWENFNMWRNNMVEQMQEDVI SLWDQSLKPCVKI/^ 
QKVHALFYKLDIVQINSSEYRLINCNTSVIKQACPKISFDPIPIHYCTP 

TQLLLNGSLAEEEI I IRSENLTNNAKTI I VHLNKSVEINCTRPSNNTRTSITMGPGQVFYRTGDIIGDIRKAYCEINGIKWNE^ 

VQVTGKLKEHFNKTIIFQPPSGGDLEIITHHFSCRGEFFYCNTTKLFNNTCIGNT 

_APPISGRINCVSNITGILLTRI3GGADNNTTNETFRPGGGNIKDNWRSELYKYKVVEIEPLGIAP^ 

" SNLLRAIEAQQHLLQLTVWG IKQI-Q^VLAVERYLKIJQKFLGLWGC SGKI ICTTAVFWNS SWSNK S FEE IWDNMTWIEWEREI SN 
YTSQIYEILTESQNQQDRNEKDLLEI^KWASLWNW* 

*Amino acids shown in blue are for easy identification of the junction of the
deleted fusion cleavage site.

CODON-OPTIMIZED 97CNGX2F-AE 140CF.seq (1921 nt.)
Nick name: 018

ttcagtcgacagccaccATGCGAGTAAAAGAGACACAAATGAATTGGCCCAATTTGTGGAAGTGGGGAACATTGATCCTGGGACT
GGTGATAATCTGTAGTGCATCCGACAATCTCTGGGTGACCGTTTACTATGGTC 

TTCTGTGCAAGCGACGCC AAAGCCC ACGAAAC TGAAGTCC ATAATGTATGGGCC ACCC ACGCGTGCGTACCAACC GACCCTAATC 

CCCAAGAGATCCACCTTGAGAATGTAACTGAGAATTTTAACATGTGGAGAAATAACATGGTGGAACAAAT 

TTCCTTGTGGGACCAGAGCCTTAAACCTTGTGTCAAATTGACTC 

AACAGCAACAACACTACCAACGGCCCTAACAAAATTGGCAATATTACTGATGAAGTCAAGAACTGCACTT^ 

AACTGAAGGATAAG AAAC AG AAAGTC C ATGC TC TGTTCTATAAGC TCG AC AT AGTAC AAATTAATAGC TC AGAATATAGAC TGAT 

AAACTGCAATACTTCCGTTATCAAACAGGCCTGTCCAAAGATAAGCTTCGATCCC^ 

TACGCTATCCTGAAATGCAACGATAAGAATTTTAACGGCACAGGTCCCTGCAAAAACGTTTCCTCTC 

TCAAGCCTGTAGTATCAACACAACTGCTCCTGAATGGCTCCTTGGCCGAAGAAGAGATCATCATTAGAAGTGAGAACCTGACGAA 
CAACGCCAAGACTATAATAGTGCACCTCAATAAATCTGTAGAAATCAACTGTACCCGACCCTCAAACAACACTCGAACAAGTA 
ACAATGGGCCCTGGCCAAGTTTTTTACCGGACCGGCGACATAATAGGCGATATCAGAAAGGCATATTGCGAGATC 
AGTGGAACGAAGTACTGGTTC AAGTAACTGGAAAACTCAAAGAAC ATTTTAATAAGACC CAGCCCCCGAGTGGCG G 

CGACCTCGAGATTATCACCCATCACTTTTCTTGTAGAGGCGAATTTTTTTACTG 

ATCGGG AAC AC TTC T ATGG AAGG ATGT AAT AAT AC C ATT AT AC TG C C C TGTAAG ATC AAGC AG ATT ATC AAC ATGTGGC AGGG AG 
TAGGTCAGGCAATGTACGCACCACCGATTTCAGGACGGATCAATTGCGTATCAAATATC 

AGGC GC AG AC AAC AATACC AC TAAC G AGAC ATTTAGACC TGGAGGC GGC AATATAAAGGAT AATTGGAGAAGTGAGC TGTATAAA 

TACAAAGTCGTAGAGATCGAACCCCTCGGCATTGCTCCAACCCGGCiX^CCGGACTCTCACCGTACA^ 

GCATAGTCCAACAGCAGTCAAACCTCCTCCGCGCTATTGAAGCACAAGAACACCTGCTCCAGCTGACTGTGTG 

ATTGCAAGCAAGAGTGCTCGCCGTGGAACGCTATTTGAAAGATCAGAAATTTCTTGGACTT 

TGTACAACAGCGGTGCCTTGGAACTCATCCTGGAGTAATAAAAGCTTTGAAGAAATCTGGGACAATATGA 

AGAGAGAGATTTCAAACTATACAAGCCAAATTTACGAAATACTGACAGAAAGTCAAAACCAGCAGGACAGAA 

GCTCGAACTGGATAAGTGGGCCTCTTTGTGGAACTGGtaaagatcttacaa 
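
A codon-optimized sequence such as the one above can be checked against its peptide listing by translating the open reading frame with the standard genetic code. The following is a minimal Python sketch, not part of the original listing; the function name `translate` is hypothetical, and it assumes the lower-case flanking residues other than the stop codon are non-coding and have been removed by the caller.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(orf: str) -> str:
    """Translate an in-frame DNA string; '*' marks a stop codon, 'X' an unreadable codon."""
    orf = "".join(orf.split()).upper()  # drop OCR spacing, normalise case
    protein = []
    for i in range(0, len(orf) - 2, 3):
        aa = CODON_TABLE.get(orf[i:i + 3], "X")
        protein.append(aa)
        if aa == "*":
            break  # stop at the first stop codon
    return "".join(protein)


# Last eight codons of the 97CNGX2F-AE 140CF sequence plus the stop codon.
print(translate("AAGTGGGCCTCTTTGTGGAACTGGtaa"))
```

Codons containing characters outside A, C, G and T (for example, OCR damage) are reported as `X` rather than guessed.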

Wild-type DRCBL-G (854 a.a.)

MRVKG I QRNWQHLWNWG IL I LGLVI IC S AEKLWVTVYYGVPVWEDANAPLFC AS DAKAH STESHNI WATHAC VPTD P S PQE I NMR 
NVTENFNMWKNNMVEQMHEDI ISLWDESLKPCVKLTPLXrVTTLNCTEINNNST 
TDWPINEMNNENNGTNSTWYRLTNCIWSTIKQACPKVTFEPIPIHYCAPAGF^^ 

STQLLLNGSLAEKDIIISSENISDNAKVI IVHI#NRSVEINCTRPNNNTRRSVAIGPGQAFYTTGEVIGDIRKAHCNVSWTKWNET 
LRDVQAKI^EYFINKSIEFNSSSGGDLEITTHSFNCGGEFFYCOT 

YAPPIAGNITCRSNITGLILTRIX^DNNSTSEIFRPGGGDMKNNWRSELYKYKTVKIKSl^ 

IX5FLGTAGSTMGAASITLTVQVRQLLSGIVQQQSNLLRAI EAQQHLLQLTVWGI KQLRARVLALERYLKDQQLLG I WGC SGKL I C 
TTWPWOT'SWSNKSYNEIWENMTWIEV^REIDNYTYHIYSLIEQSQIQQEKNEQD 
GGLIGLRIWAVLSIWRVRQGYSPLSFQTLLHHQREPDRPAGIEEGGGEQDRDRSIRLVSGFIJU^ 
ILIAARTVELIXSRNSLKGLRLGWEALKYLWNLLLYWARELKNS^ 

*The underlined amino acid sequence is the fusion domain that will be deleted in the 140CF
design, and the "W" underlined in red is the last amino acid at the C
terminus; all remaining amino acids after the "W" will be deleted in the 140CF
design.



DRCBL-G 140CF.pep (630 a.a.)
Nick name: 017 

MRVKG I QRNWQHLWNWG I LI LGLVI ICSAEKLWVTVY^ 
NVTEWFNMWKNNMVEQMHEDI ISLWDESLKPCVKLTPIX^ 



TDWPINEMNNENNGTNSTWYRI/mCNVST^ 

STQLLLiNGSLAEKDI II SSENI SDNAKVI IVHIjNRSVEINCTRPNNNTRRSVAIGPGQAFYTTGEVIGDIRKAHCNVSVJTKWNET 

LRDVQAKLQEYFINKSIEFNSSSGGDLEITTHSFNCGGEFFYCNTSGLFNNSILKSNISENNDTITI^ 

YAPPIAGNITCRSNITGLILTRIXX3DNNSTSE 

SNLI*RAIEAQQHLLQLTVWGIKQLlRAJIVXJ\X»ERY 

YTYHIYSLIEQSQIQQEKNEQDLLALDQWASLWSW*

*Amino acids shown in blue are for easy identification of the junction of the
deleted fusion cleavage site.



CODON-OPTIMIZED DRCBL-G 140CF.seq (1921 nt.)
Nick name: 017 

ttcagtcgacagccaccATGAGAGTTAAAGGAATCCAACGCAATTGGCAACACCTTTGGAACTGGGGCATATTGAT^

GGTGATAATTTGTAGCGCTGAAAAACTCTGGGTAACTGTCTATTACGGCG 

TGCGCAAGTGATGCAAAGGCTCACAGCACTGAATCTCACAACATTTGGGCC^ 

AGGAGATC AAC ATGAG AAAC GTT AC C GAAAATTTTAATATGTGG AAG AATAATATGGTGG AGC AAATGC ACGAAGAC AT AATTTC 
ACTCTGGGACGAGTCTCTGAAACCATGTGTGAAACTTACCCCCCTGTGCGTCACCCTGAACTG 

ACGAGAAAT ATC AC AG AAGAAT AC C GAATG AC TAAC TGTTCC TTTAATATGAC AACC G AAC TGC GAGAC AAAAAGAAGGCTGAAT 
ACGCACTTTTCTACCGAACAGATGTTGTACCAATCAACGAGATGAACAATGAAAACAATGGAACGAACTCT 

GAC AAACTGTAAC GTTAGC ACAATC AAGC AGGCCTGC CCTAAAGTC ACATTCGAACCAATACC AATTC ACTACTGC GC ACCCGCC 
GGATTC GCTATTCTTAAGTGCGTGGATAAGAAGTTTAACGGAACTGGAACCTGC AATAAT^ GCATG 
GAATT AAG C C TGTC GTTTC AAC C C AGTTGC TGCTG AATGGATC AC TC GC AGAAAAGGAT ATT ATTATC TC AAGC G AAAAC AT ATC 
TGATAATGCAAAGGTCATCATCGTCCACCTCAACCGCTCAGTTGAAATAAACTGK:ACTCGGCCTAA 

GTCGC AATC GGC C C AGGAC AAGC TTTTTAC AC TAC CGGGGAAGTT ATC GGCGAC AT AC GGAAAGCC C ACTGC AACGTTAGC TGG A 
CC AAGTGGAATGAAAC AC TGC GC G ATGTTC AAGCC AAAC TTC AAGAAT AC TTC AT AAAC AAATCAATTG AG TTC AATTC TAGC TC 
TGGCGGCGACCTCGAGATTACAACTCACTCCTTTAACTGCGGCGGCGAATTCTTTTATTGTAATACCTCCGG 
TCTATCCTCAAAAGTAACATTTCTGAAAATAATGACACAATCACACTGAATTGCAAGATCAAGCAGATT 

GAG TCGGAC AAGC TATGTACGCCCCACCC ATC GCCGGAAATATAACGTGTCGATC AAATATC ACTGGCCTC ATCCTTAC TAGAGA 

TGGCGGAGACAATAATAGCACCAGCGAGATATTCAGACCAGGCGGAGGCGATATGAAAAACAACTGGAGGTCAGAGCTCTACAAG 

TACAAAACAGTCAAAATTAAAAGCCTGGGCATTGCTCCCACTCGGGCCCGCACACTGACTGTCCAAGTC 

GAATCGTCCAACAACAGTCCAACTTGCTGCGCGCTATAGAGGCTCAACAACATCTCCTTCAACTGACTGTC 

ATTGAGAGCAAGAGTGCTGGCGCTGGAACGGTATCTTAAGGACCAACAACTCCTGGGCATATGGGGGTC 

TGCACAACAAATGTACCCTGGAACACCAGCTGGTCAAATAAAAGTTATAATGAGATATGGGAAAACATC 

AAAGGGAAATTGACAATTATACATACCATATATACTCTCTCATCGAACAATCTCAGATACAACAGGAAAAGAATGAACAAGATTT
GTTGGCTCTTGACCAATGGGCTTCTTTGTGGAGTTGGtaaagatcttacaa
2003 Centralized HIV-1 Envelope Proteins and the Codon-Optimized Gene Sequences


2003 CON-S Env

fT MRVMG I QRNCQHL WRWG I LI FGML I ICSAAENLWVTVYYGVPWKKAlSrTT^ 
IVLENWENFNTWK1JNMVEQMHEDIISLVTO 

ALFYKLDVVPIDDNNSYRLINCNTSAITQACPKVSFEPIPIHYCAPAGFAILKCNDKKFNGTGPCKl^ 

* TQLLLNGSLAEEEI I IRSENITNNAKTI I VQLNESVE INCTRPNNNTRKS I RIGPGQAFYATGDIIGDIRQAHCNISRTKWN 

KTLQQVAKKLREHFNKTIIFNPSSGGDLEITTHSFNCGGEFFYOre 

APP I EGKI RCTSN ITGLLLTRDGGNNirraTFRPGGGDMRDNWRSELYKYKW 
. LGFLGAAGSTMGAASITLTVQARQDLSGIVQQQSNLIjRAIEAQ 

LICTTNVPWNSSWSNKSQDEIWDNMTWMEWDKEINNYTDIIYSLIEESQNQQE 
* IFIMIVGGLIGLRIVFAVIjSIVNRVRQGYSPIjSFQTLIPNPRGPDRPEGIEEEGGEQDRDRSIRIjVNGFIA 

F S YHRLRDL I Jj I AARTVELLGRRGWEALKYLWNLLQYWGQELKNS AI S LLDTTAI AVAEGTDRVI EWQRVCRAI LN I PRRI 

RQGFERALL$






2003 CON-S Env.seq.opt

ATGCGCGTGATGGGC^TC<^GCGOUVCTGCC^GCACCTGT^ 
CCGCCGCCGAGAACCTGTGGGTGACCGTGTACTACGGCGTGCCCGTC 
CGACGCGAAGGCCTACGAGACCGAGGTGCAGAACGTGTGGGCCACCCAC 

ATCGTGCTGGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGA 

TGTGGGACCAGTCCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTGACCCTGAACTGCACCGACGTGAACGCCACCAA 
CAACACCACCAACAACGAGGAGATCAAGAACTGCTCCTTCAACATCACCACCGAGATCCGCGACAAGAAGAAGAAGGT^ 
GCCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACGACAACAACTCCTACCGC^ 
CCCAGGCCIX3CCCCAAGGTGTCCTTCGAGCCCATCCCC^^ 

CGACAAGAAGTTCAACGGCACCGGCCCCTGCAAGAACGTGTCCACCGTGCAGTGCACCCACGGCATCA^ 

ACCCAGCTGCTGCTGAACGGC^CCCTGGCCGAGGAGGAGATCATCATCCGCTCCGAGAACATCACCAACAACGCCAAGACCA 
TCATCGTGCAGCTGAACGAGTCCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCAAGTCCATCCGCAT 
CGGCCAXK3CCTTCTACGCCACCGGCGACATCAT 
AAGACCCTGCAGCAGGTGGCCAAGAAGCTGCGCGAGCACTT 
TGGAGATCACCACCCACTCCTTCAACTGCGGCGGCGAGTTCTTCTAC 
CGGCkCCAACAACACCATCACCCTGCCCTGCCGCATC^ 

GCCCCCCCCATCGAGGGCAAGATCCGCTGCACCTCCAACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAACAACAACA 
CCGAGACCTTCCGCCCCGGCGGCGGCGACATGCGCGAGAACTGGCGCTCCGAGCTGTACAAGTAGAA 
GCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGGAGCGCGAGAAGCGCGCCGTGGGC^ 
CTGGGCTTCCTGGGCGCCGCCGGCTCCACCATGGGCGCCGCCTCC^^ 
GCATCGTGCAGCAGCAGTCCAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTG 
GCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGC^ 
CTGATCTGCACCACCAACGTGCCCTGGAACTCCTCCTGGTCCAACAAGTCCCAGGACC^ 
TGGAGTGGGACAAGGAGATCAACAACTACACCGACATCATCTACTCC 

CGAGCAGGAGCTGCTGGCCCTGGACAAGTGGGCCTCCCTGTGGAACTGGTTCGACATCACCAACTGGCTGTG 
ATCTTCATCATGATCX3TGGGCGGCCTGATCGGCCTGCGCATCGTGTTCGCCGTGCTGTCGAT 

GCTACTCCCCCCTGTCCTTCCAGACCCTGATCCCCAACCCCCGCGGCCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGG 
CGAGCAGGACCGCGACCGCTCCATCCGCCTGGTGAACGGCTTCCTGGCCCTGGCCTGGGACGACCTGCGCTCCCTGTGCCTG 
TTCTCCTACCACCGCCTGCGCGACCTGATCCTGATCGCCGCCCGCACCGTGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCC 
TGAAGTACCTGTGGAACCTGCTGCAGTACTGGGGCCAGGAGCTGAAGAACTCCGCCATCTCCCTGCTGGACACCACCGCCAT 
CGCCGTGGCCX3AGGGCACCGACCGCGTGATCGAGGTGGTGCAGCGCGTGTGCCGCGCCATCCTGAACATCCCCCGCCGCATC 
CGCCAGGGCTTCGAGCGCGCCCTGCTGTAA 


2003 M.Group.anc Env

U{ MRVMG I QRNCQHLWRWG I Li I FGMLM 1 CSAAENI*WVTVYYGVPVWKEA1TI^IjFCASDAKAYDTEVHNVW 
/ I VLENVTENFNMWKNNMVEQMHED 1 1 SLWDQS LKPCVKLTPLCVTLNCTDVNATNNSTNMGE I KNCSFNITTE I RDKKQKVY 
AliFYTlLDVVPINDNNSYRLINCNTSAITQACPKVSFEPIPIHYCAPAGFAILKCNDKKFNGTGP 

TQLLLNGS LAEEE 1 1 IRSENITDNAKTI I VQLNESVE INCTRPNNNTRKS I R IGPGQ AF YATGD I IGDIRQAHCNX SGAEWN 
KTLQQVAAKLREHFNNKTIIFKPSSGGDLEITTHSFNCGGEFFYCNTSGLFNSTWNGTNETITLPCRIK 

YAPP I AGNITCKSN I TGLLLTRDGGTNNTETFRPGGGDMRDNWRSEL YKYKWKI EPLGVAPTKAKRRWEREKRAVGIGAV 
FLGFLGAAGSTMGAASITLTVQARQLLSGIVQQQSNLLRAIEAQQHLLQLTVWGIKQI^ARVlxAVERY 

KL I CTTNVP WNS SW SNKS QDE I WDNMTWMQWERE I SNYTDI I YSL I EESQNQQEKOTQDLLALDKWASLWNWFDITNWLWYI 
KIFIMIVGGLIGLRIVTAVXSIVNRVllQGYSPLSFQTLIPNPRGPDRPGGIEEEGGEQDRDRSIRLVSGFLALAWDDLRSIjC 
LFSYHRLRDFILIAARTVELLGRRGWEAIjKY^
IRQGFERALL$ 

2003 M.Group.anc Env.seq.opt

ATGCGCGTGATGGGCATCCAGCGCAACTGCCAGCACCTGTGGCGCTGGGGGATCCTGAT^ 

CCGCCGCCGAGAACCTGTGGGTGACCGTGTACTACGGCGTGCCCGTGTGGAAGGAGGCCAACACCACCCTGTTCTGCGCCTC 
CGACGCCAAGGCCTACGACACCGAGGTGCACIAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAG 
ATCGTGCTGGAGAACGTGACCGAGAACTTCAAC^TGTGGAAGAACAACATGGTGGAGCAGATG 

'TGTGGGACCAGTCCCTGAAGCCCTGCGTGAAGCHX3ACCCCCC 
CAACTC CAC CAACATGGGCGAGATCAAG AACTGCTC CTTCAACATCACCAC CGAGATCCGCGACAAGAAGCAGAAGGTGTAC 
GCCCTGTTCTACCGCCTGGACGTGGTGCC(^TCAACGACAA<^^CTCCTACCGCCTGATCAACTC 

- CCCAGGCCTGCCCCAAGGTGTCCTTCGAGCCCATCCCCATC^^ 
CGACAAGAAGTTCAACGGCACCGGCCCCTGCAAGAACGTGTCCACCGTGCAGTGCACCCACGGCATCAAGC 
ACCCAGCTGCTGCTGAACGGCTCCCTGGCCGAGGAGGAGATCATCATCCGCTCCGAGAAC^ 
TCATCGTGCAGCTGAACGAGTCCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCAAGTC 

CGGCCAGGCCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCTCCGGCGCCGAGTGGAAC 

AAGACCCTGCAGCAGGTGGCCGCCAAGCTGCGCGAGCACTTGAACAACAAGACCATCATCTTCAAGC 

ACCTGGAGATCACCACCCACTCCTTGAACTGCGGCGGCGAGTTCTTC^ 

GAACGGCACCAACGAGACCATCACCCTGCCCTGCCGCATCAAGCAGATCGTGAACATGTGG 
TACGCCCCCCCC^TCGCCGGCAAC^TCACCTGCAAGTCCAACATCLA.CCGGCCTGCTGCTC 

ACACCGAGACCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCTCCGAGCTGTACAAGTACAAGGTGGTGAAGAT 
CGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGGAGCGCGAGAAGCGCGCCGTGGGCATCGGCGCCGTG 
TTCCTGGGCTTCCTGGGCGCCGCCGGCTCCACCATGGGCGCCGCCTCCATCACCCTGACCGTGCAGGCCCGCCAGCTGCTGT 
CCGGCATCGTGCAGGAGCAGTCCAACCTGCTGCGCGCCATCGAGGCCCAGC^^ 

CAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCTCCGGC 
AAGCTGATCTGCACCACCAACGTGCCCTGGAACTCCTCCTGGTCCAACAAGTCCCAGGA 

GGATGCAGTGGGAGGGCGAGATCTCCAACTACACCGACATCATCTACTCCCTGATCGAGGAGTCCCAGAACCAGCAGGAGAA 

GAACGAGCAGGACCTGCTGGCCCTGGACAAGTGGGCCTCCCTGTGGAACTGGTTCGACATCACCAACTGGCTGTGGTACATC 

AAGATCTTCATCATGATCGTGGGCGGCCTGATCX5GCCTGCGCATCGTGTTCGCCGTGCTGTCCATCGTGAACCGCGTGCGCC 

AGGGCTACTCCCCCCTGTCCTTCCAGACCCTGATCCCO^ACCCCCGCGGCCCCGACCGCCCCGGCGGCATCGAGGAGG^ 

CGGCGAGCAGGACCGCGACCX3CTCCATCCGCCTGGTGTCCGGCTTCCTGGCCCTGGCCTGGGACGACCTGCGCTCCCTGTGC 

CTGTTCTCCTACCACCGCCTXSCGCGACTTCATCCTGATCGCCGCCCGCACCGTGG^ 

CCCTGAAGTACCTGTGGAACCTGCTGCAGTACTGGGGCCAGGAGCTGAAGAACTCCGCCATC 

CATCGCCGTGGCCGAGGGCACCGACCGCGTGATCGAGGTGGTGCAGCGCGCCTGCCGCGCCATCCTGCACATCCCCCGCCGC 
ATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAA 

2003 CON_A1 Env

MRVMG I QRNCQHLLR WGTM I LGM I IICSAAENLWVTVYYGVPVWKDAETTLFCASD 

IHLENVTEEFNMWKNNMVEQMHTD IIS LWDQSLKPCVKIiT PLCVTI^CSNVNVTNNTTNTHEEE I KNC S FNMTTELRDKKQK 
VYSLiFYRLiDVVQINENNSNSSYRLINCNTSAITQACPKVS FEPIPIHYCAPAGFAILKCKDKEFNGTGPCKNVSTVQCTHGI 
KPWSTQLLLNGSLAEEEVI IRSENITNNAKTI IVQLTKPVKINCTRPNNNTRKS IRIGPGQAFYATGDI I GDI RQAHCNV S 
RSEWNKTL.QKVAKQLRKYFKNKTI IFTNSSGGDLEITTHSFNCGGEFFYCNTSGLFNSTWNNGTMKNTITLiPCRIKQI INMW 
QRAGQAMYAPPIQGVIRCESNITGI^IjTRDGGNNNTNETFRPGGGDMRDNWRSELY 

RAVGIGAVFLGFLiGAAGSTMGAAS ITI/TVQARQLIiSGI VQQQSNLiIjRAI EAQQHLJjKLTVWG IKQLQARVIAVERYIjKDQQLi 
I^IWGCSGKLICTTNVPWNSSWSNKSQNEIWDNMTWLQWDKEISNYTH 

ISNWLWYIKIFIMIVGGlilGLRIVFAVLSVINRVRQGYSPLSFQTHTPNPRGLDRPGRIEEEGGEQGRDRSIRLVSGFIiA 

WDDLRSLCLFSYHRIjRDFILIAARTVEIjIiGHSSIjKGLRIX3WEGLKYIiWNI*LL 

IGQRIGRAILHIPRRIRQGLERALL$

2003 CON_A1 Env.seq.opt

ATGCGCGTGATGGGCATCCAGCGCAACTGCCAGCACCTGCTGCGCTGGGGCACCATG 

ccgccgccgagaacctgtgggtgaccgtgtactacggcgtgcc^ 

cgacgccaaggcctacgagaccgagatgcacaacgtgtgggccacccacgcctgcgtgcccaccgaccccaacccccaggag 

ATCCACCTGGAGAACGTGACCGAGGAGTTCAACATGTGGAAG 

TGTGGGACCAGTCCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTC 

CAAGACCACCAACACCCACGAGGAGGAGATCAAGAACTGCTCCTTCAACATGACCA 

GTGTACTCCCTOTTCTACCGCCTGGACGTGGTGCAGATCAACGAGAACAA.CTCCAACTCCTCCTA 

ACACCTCCGCCATC^CCCAGGCCTGCCCC^^GGTGTCC^ 

CATCCTGAAGTGCAAGGACAAGGAGTTCAACGGCACCGGCCCCTGCAAGAACGTGTCCACCGTGCAGTGCACCCACGGCATC 
AAGCCCGTGGTGTCCACCCAGCTGCTGCTGAACGGCTCCCTGGCCGAGGAGGAGGTGATCATCCGCTCCGAGAACATCACCA

ACAACGCCAAGACCATCATCGTGCAGCTGACCAAGCCCGTGAAGATCAACTGCACCCGCCCCAACAAG^^ 

CATCCGCATCGGCCCCGGCCAGGCCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACGTGTCC 

CGCTCCGAGTGGAACAAGACCCTGCAGAAGGTGGCCAAGCAGCTGCGCAAGTACTTCAAGAACAAGACCA 

ACTCCTCCGGCGGCGACCTGGAGATCTVCC^CCCACTCCTTCAA^ 

GTTCAACTCCACCTGGAACAACGGCACCATGAAGAACACCATCA 

GAGCXSCGCCGGCCAGGCCATGTACGCCCCCCCCATCCAGGGCGTGATCC^ 

" CCCGCGACGGCGGGAACAACAACACCAACGAGACCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCTCCGAGC^ 
GTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCCGCGCCAAGCGCCGCGTGGTGGAGCGCGAGAAG 
CGCGCCGTGGGCATCGGCGCCGTGTTCCTGGGCTTCCTGGGCGCCGCCGGCTCCACCATGGGCGCCGCCTCCATCACCCTGA 

CCGTGCAGGCCCGCCAGCTGCTGTCCGGCATCGTGCAGCAGCAGTCCAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCT
GCTGAAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAG 
CTGGGCATCTGGGGCTGCTCCGGCAAGCTGATCTGCACCACCAACGTGCCCTGGAACTCCTCCTGGTCCAACAAGTCCCAGA 
ACGAGATCTGGGACAACATGACCTGGCTCCAGTGGGACAAGGAGATCTCCAACT 
GGAGTCCCAGAACCAGCAGGAGAAGAACGAGCJVGGACCTGCTGGCCCTGGACAAGT^ 

ATCTTCCAACTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGATCGGCCTGCGCATCGTGTTC^ 

TGTCCGTGATCAACCGCGTGCGCCAGGGCTACTCCCCCCTGTCCITCCAGACCCACACCCCCAACCCCCGC 

CCCCGGCCGCATCGAGGAGGAGGGCGGCGAGCAGGGCCGCGACCGCTCCATCCGCCTGGTGTCCGGCTTCCTGGCCCTGGCC 

TGGGACGACCTGCGCTCCCTGTGCCTGTTCTCCTACCACCGCCTGCGCGACTTCATCCTGATCGCCGCCCGCACCGTGGAGC 

TGCTGGGCCACTCCTCCCTGAAGGGCCTGCGCCTGGGCTGGGAGGGCCTGAAGTACCTGTGGAACCTGCTGCTGTACTGGG^ 

CCGCGAGCTGAAGATCTCCGCCATCAACCTGGTGGACACCATCGCCATCGCCGTGGCCGGCTGGACCGACCGCGTGATCGAG

ATCGGCCAGCGCATCGGCCGCGCCATCCTGCACATCCCCCGCCGCATCCGCCAGGGCCTGGAGCGCGCCCTGCTGTAA 

2003 A1.anc Env

MRVMG I QRNCQHIiWRWGTM I FGMI 1 1 CS AAENL WVTVYYGVP VWKD AETTLF GASDAKAYDTEVHNVWATHACVPTD PNPQE 
IDLENVTEEFNMWKfcTOMVEQMHADIISL^ 

VYSLFYRIJDWPINENNSNSSYRLINC^rrSAITQACPKVSFEP 

KPWS TQL.LLNGS LAEEEVM XRSENI TDNAKT 1 I VQLTE P VKINCTRPNNNTRKS IRIGPGQAFYATGDI IGDIRQAHCNVS 

RTEWNKTI^KVAAQLRKHFNNKTIIFNSSSGGDLEIT^ 

QRVGQAMYAPPIQGVIRCESNITCLLIjTRDGGN^^ 

RAVGLGAVFLGFLGAAGSTMGAASITIjTVQARQIjLSGIVQQQSNIjL 

I^IWGCSGKLICTTNVPWNSSWSNKSQDEIWDNMTWIjQWDK 

ISNWLWYIKIFIMIVGGLIGI^IVFAVLSVINRWQGYSPLSFQTLTPN^ 

WDDLRSLCLFSYHRIjRDFXLXAARTVEIjIJGRSSLKGIjRIjGWEGL 

IGQRICRAILNIPRRIRQGLERALL$ 

2003 A1.anc Env.seq.opt

ATGCGCGTGATGGGCATCCAGCGCAACTGCCAGCACCTGTGGCGCTGGGGC^CCATO 

CCGCCGCCGAGAACCTGTGGGTGACCGTGTACTACGGCGTGCCCGTGTGGAAGGACGCCGAGACCACCCTGTTCTGCGCCTC 

CGACGCGAAGGCCTACGACACCGAGG1X3CACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGAC 

ATCGACCTGGAGAACGTGACCGAGGAGTTCAACATGTGGAAGAACAACATGGTGGAG 

TGTGGGACCAGTCCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTGACCCTGAACTGCTCCAACGTGAACGTGACCAA 
CAACACCACCAACAC CCACGAGGAGGAGATCAAGAACTG CTCCTTCAACATGACCAC CGAGCTGCGCGAC AAGAAGCAGAAG 
GTGTACTCCCTGTTCTACCGCCTGGACGTGGTGCCCATCAACGAGAACAACTCCAACT 
ACACCTCCGCCATCACCC^GGCCTGCCCCAAGGTGTCCTTCGAGCCCATC 
CATCCTGAAGTGe^GGACAAGGAGTTCAACGGCACCGGCCCCTGCAAGAAC 

AAGCCCGTGGTGTCCACCCAGCTGCTGCTGAACGGCTCCCTGGCCGAGGAGGAGGTGATGATCCGCTCCGAGAACATCACC^ 
ACAACGCCAAGACCATCATCGTGCAGCTGACCGAGCCCGTGAAGATCAACTGCACCCGCCCCAACAACAACACCCGCAAGTC 
CATCCGCZATCGGCCCCGGCCAGGCCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACGTGTCC 
CGCACCGAGTGGAAGAAGACCCTGCAGAAGGTGGCCGCCCAGCTGCGCAAGCACTTCAACAACAAGACGATCATCTTCAACT 
CCTCCTCCGGCGGCGACCTGGAGATCACCACCCACTCCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACACCTCCGGCCT 
GTT CAACTC CAC C TGG AACAACGG CAC CATG AAGG ACAC CAT C AC C CTG C C CTG C C G CATCAAG CAG ATCAT CAACATGTGG 
GAGCGCGTGGGCCAGGCC1ATGTACGCCCCCCCGATCCAGGGCGTGATCCGCTGCGAGTCCAA 

CCCGCGACGGCGGCAACAACAACACCAACGAGACCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCTCCGAGCT 
GTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCCGCGCCAAGCGCCGCGTGGTGGAGCGCGAGAAG 
CGCGCCGTGGGCCTGGGCGCCGTGTTCCTGGGCTTCCTGGGCGCCGCCGGCTCCACCATGGGCGCCGCCTCCATCACCCTGA 
CCGTGCAGGCCCGCCAGCTGCTGTCCGGCATCGTGCAGCAGCAGTCCAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCT 
GCTGAAGCTGACCGTGTGGGG CATCAAG CAG CTG CAGG CCCG CGTG CTGGCCGTGGAG CG CT ACCTGAAGGACCAGCAGCTG 
CTGGGCATCTGGGGCTGCTCCGGCAAGCTGATCTGCACCACGAACGTGCCCTGGAACTCC 
ACGAGATCTGGGACAACATGACCTGGCTGCAGTGGGACAAGGAGATCTCCAACTA^^
GGAGTCCCAGAACCAGC^GGAGAAGAACGAGCAGGACCTGCTGGCCCTGGACAAGTG 
ATCTCCAACTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGATCGGCCTGCGCA 
TGTCCGTGATCAACCGCGTGCGCC^GGGCTACTCCCCCCTGTC 

CCCCGGCCGC^TCGAGGAGGAGGGCGGCGAGCAGGGCCGCGACCGCTCCATCCGCCTGGTGTCCGGCTTCCTGGCCCTGGCC 
TGGGACGACCTGCGCTCCCTGTGCCrrGTTCTC 

TGCTGGGCCGCTCCTCCCTGAAGGGCCTGCGCCTGGGCTGGGAGGGCCTGAAGTACCTGTGGAACCTC 
CCGCGAGCTGAAGATCTCCGCCATCAACCTGCTGGACACCATCGCCATCGCCGTGGCCGGCTGGACCGACCGCGTGATCGAG
ATCGGCCAGCGCATCTGCCGCGCCATCCTGAACATCCCCCGCCGCATCCGCCAGGGCCTGGAGCGCGCCCTGCTGTAA 

2003 CON_A2 Env

MRVMGTQRNYQHLWRWG I LIIjGMLIMCKATDLWVTVYYGVPVWKDADTTLFCASDAKAYDTEVHNVWATHACV 
NIjENVTEDFNMWKNNK^TEQMHEDIISIjWDQSL 

FYKLDVVQI^ESNKSEYYYRLINCNTSAITQACPKVSFEPIPIHYGAPAGFAIIjKCKDPRFNGTGSCNNV^ 

ASTQLLLNGSIjAEGKVMIRSENITNNAKNI ivqfnkpvpitcirpnnntrksirfgpgqafytndiigdirqahcninktkw 
N ATLQKVAEQLiREH F PNKT 1 1 FTN S S GGDL E I TTHS FNCGGE F FYCNTTGL FN S TWKNGTTNNTEQM I TLP CR I KQ 1 1 NMWQ 
RVGRAMYAPPIAGVIKCTSNITGIILTRDGGNNETETFRPGGGDMRDNTO 

VGMGAVFIjGFIjGAAGSTMGAAS ITIjTVQARQLIjSGI VQQQSI^DKAIRAQQHIjIjKIjTVWGI KQLQARVIAIjERYIjQDQ 

I WGCSGKLICATTVPWNSSWSNKTQEEI WNNMTWLQWDKEI SNYTNI I YKLLEESQNQQEKNEQDIjLAIjDKWANLWNWFNIT 

NWLWYIRIFIMIVGGLIGIJiiVIAIISVVNRVRQGYSPLSFQIPTPNPEG 

DLRSLCLFSYHRLRDCILIAART\^LLGHSSIjKGLRIX3WEGIjKYIiWNIiLL 

QRACRAILNIPRRIRQGFERALL$

2003 CON_A2 Env.seq.opt

ATGCGCGTGATGGGCACCCAGCGGAACTACCAGCACCTGTGGC 

AGGCCACCGACCTGTGGGTGACCGTGTACrTACGGCGTGCCCGTGTGGAAGGACGCCGACACCACCCTGTTCTGCGC 
CGCCAAGGCCTACGACACCGAGGTGCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGGTG 
AACCTGGAGAACGTGACCGAGGACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGA 
GGGACGACTCCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTC 

CTCCACCATGGAGGAGATC^GAACTGCTCCTACAACATCACCACCGAGCTOCGCGACAAG 
TTOTACAAGCTGGACGTGGTGCAGCTGGACGAGTCCAACAAGTCCGAGTACTACTACCGCCTG 
CCT^TCACCCAGGCCrrGCCCCy^GGTGTCCTTCGAGCCC^TCCCCATCCACTACr 
GTGCAAGGACCCCCGCTTCAACGGCACCGGCTCCTGCAACAACGTGTCCTCCGTGCAGT^ 

GCCTCCACCCAGCTGCTGCTGAACGGCTCCCTGGCCGAGGGCAAGGTGATGATCCGCTCCGAGAACATCACCAACAACGC 
AGAACATCATCGTGCAGTTCAACAAGCCCGTGCCCATCACCTGCATCCGCCCCAACAACAACACCCGCAAGTCCATCC 
CGGCCCCGGCCAGGCCTTCTACACGAACGACATCATCGGCGACATCCGCC^^ 
AACGCCACCCTGCAGAAGGTGGCCGAGCAGCTGCGCGAGCACTTCCCCAACAAGACC^ 

GCGACCTGGAGATCACCACCCACTCCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACACCACCGGCCTGTTCAACTCCAC 
CTGGAAGAACGGCACCACCAACAACACCGAGCAG ATGATCACCCTG C CCTG C CGCATCAAG CAGATCATCAACATGTGGC AG 
CGCGTGGGCCGCGCCATGTACGCCCCCCCCATCGCCGGCGTGATCAAGTGCIA^CT 
GCGACGGCGGGAACAACGAGACCGAGACCTTTCCGCCCCGGCGGCGGC^^ 

GTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCCGCGCCAAGCGCCGCGTGGTGGAGCGCGAGAAGCGCGCC 
GTGGGCATGGGCGCCGTGTTCCTGGGCTTCCTGGGCGCCGCCX3GCTCCACCATC 

AGGCCCGCCAGCTGCTGTCCGGCATCGTGCAGCAGCAGTCCAACCTGCTGAAGGCCATCGAGGCCCAGCAGCACC 
GCTGACCGTX3TGGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCCTGGAGC 

ATCTGGGGCTGCTCCGGCAAGCTGATCTGCGCCACCACCGTGCCCTGGAACTCCTCCTGGTCCAACAAGACCCAGGAGG 

TCTGGAACAAGATGACCTGGCTGCAGTGGGAGAAGGAGATCTCCAACTACACCAACATC^ 

CCAGAACCAGCAGGAGAAGAACGAGCAGGACCTGCTGGCCCTGGACAAGTGGGCCAACCTGTGGAAC 

AACTGGCTGTGGTAGATCCGCATCTTCATCATGATCGTGGGCGGCCTGATCGGCCTGCGCATCGTGATCGCCATCATCTCCG 
TGGTGAACCGCGTGCGCCAGGGCTACTCCCCCCTGTCCTTCCAGATCCCCACCCCCAACCCCGAGGGCCTGGACCGCCCCGG 
CCGCATCGAGGAGGGCGGCGGCGAGCAGGGCCGCGACCGCTCCATCCGCCTGGTGTCCGGCTTCCTGGCCCTGGCCTGGGAC 
GACCTGCGCTCCCTGTGCCTGTTCTCCTACCACCGCCTGCGCGACTGCATCCTGATCGCCGCCCGCACCGTGGAGCTGCTGG 
GCCACTCCTCCCTGAAGGGCCTGCGCCTGGGCTGGGAGGGCCTGAAGTACCTGTGGAACCTGCTGCTGTACTGGGGCCGCGA 
GCTGAAGAACTCCGCCATCTCCCTGCTGGACACCATCGCCGTGGCCGTGGCCGAGTGGACCGACCGCGTGATCGAGATCGGC 
CAGCGCGCCTGCCGCGC<^TCCTGAACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTC 

2003 CON_B Env

* MRVKG I RKNTYQKLWRWGTMLL.GMIJM ICSAAE KLWVTVYYGVPVWKEATTTLFCASDAiCAYDTK 

WXiENVTENFNMWKNKMVEQMHED 1 1 SLWDQ SLKPCVKLTPLCNTTXiNCTDLMNATNTNTT 1 1 YRWRGE I KNCS FN I TTS I RD 
KVQKEYALFYKLDWPIDNDNTSYRLISCNTSVITQA
GIRPWSTQLLLNGSIiAEEEWIRSENFTDNAKTIIVQ^^ 

I SRAKWNNTIiKQ I VKKLREQFGNKTI VFNQS SGGDPE I VMHSFNCGGBFFYCmTQLFNSTWNGTWNNTEGNITLPCRIKQI 
INMWQEVGKAMYAPPIRGQIRCS SNITGLI^TRD^ 

REKRAVG IGAMF LGFLGAAGSTMGAASMTLTVQARQLLSG I VQQQNNL.LRAI EAQQHLLQLTVWGI KQLQARVLAVERYLKD 
QQLI^IWGCSGKLICTTAvTVmASWSNKSIiD^ 

WFDITNWLWYIKIFIMIVGGIiVGLRIVFAVIiSIVNR\mQGYSPLSFQTRLPAPRGPDRPEGIEEEGGEra 

^IWDDLRSLCLFSYHRLRDLLLIVTRIV^L^ 

ACRAILHIPRRIRQGLERALL$

2003 CON_B Env.seq.opt 

ATGCGCGTGAAGGGCATCCGCAAGAACTACCAGCACeTGTGGCGCTGGGGCACCATGCTGCTG 

CC^CCGCCGAGAAGCTGTGGGTGACCGTGTACTACGGCGTGCCCGTGTGGAAGGAGGCCACCACCACCCTGTTCTGCGCCTC 

CGACGCCAAGGCCTACGACACCGAGGTGCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAG 

GTGGTGCTGGAGAACGTGAC CG AGAACTTCAACATGTGG AAGAAGAACATGGTGG AGCAGATGCACGAGGACATCATCTCC C 

TGTGGGACCAGTCCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTGACCCTGAACTGCACCGACCTGATGAACGCCAC 

CAACACCAACACGACCATCATCTACCGCTGGCGCGGCGAGATCAAGAACTGCTCCTTCAACATCACCACCTCCATCCGCGAC 

AAGGTGCAGAAGGAGTACGCCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACGACAACACCTCCTACCGCCTGATCT 

CCTGCAAGACCTCCGTGATCACCCAGGCCTGCCCCAAGGTGTCCTTCGAGCCCATCC 

CTTCGCCATCCTGAAGTGCAACGACAAGAAGTTCAACGGCA^^ 

GGCATCCGCCCCGTGGTGTCC^CCCAGCTGCTGCTGAACGGCTCCCTGGCCGAGGAGGAGGTGGTGATCCGCTCCGAGAACT 

TCACCGACAACGCCAAGACCATCATCGTGCAGCTGAACGAGTCCGTGGAGATCAACTGCACCCGCCCCAACAACA 

CAAGTCCATCCACATCGGCCCCGGCCGCGCCTTCTAC^^ 

ATCTCCCGCGCCAAGTGGAACAACACCCTGAAGCAGATCGTGAAGAAGCTGCGCGAGCAGTTCGGCAACAAGACCATCGTGT 

TCAACCAGTCCTCCGGCGGCGACCCCGAGATCGTGATGCACTCCTTCAACTGCGGCGGCGAGTTCTO 

CCAGCTGTTCAACTCCACCTGGAACGGCACCTGGAACAACACCGAGGGCAACATCACCCTGC 

ATCAACATGTGGCAGGAGGTGGGtCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCTGCTCCTCCAACATCACCG 

GCCTGCTGCTGACCCGCGACGGCGGCAACAACGAGACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCG 

CTCCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCGCCGCGTGGTGCAG 

CGCGAGAAGCGCGCCGTGGGCATCGGCGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGGCTCCACCATGGGCGCCGCCTCCA 

TGACCCTGACCGTGCAGGCCCGCCAGCTGCTGTCCX^CATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCA 

GCAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGGAGGCCCGCGTGC^ 

CAGC^GCTGCTGGGCATCTGGGGCTGCTCCGGCAAGCTGATCTGCACCA 

AGTCCCTGGACGAGATCTGGGACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGACAACTACACCTCCCTGATCTACAC 

CCTGATCGAGGAGTCCC^GAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCTCCCTGTGGAAC 

TGGTTCGACATCACCAACTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCTGCGCATCGT 

TCGCCGTGCTGTCCATCGTGAACCGCGTGCGCCAGGGCTACTCCCCCCTGTCCTTCCAGACCCGCCTGCCCGCCCCCCGCGG 

CCCCGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCTCCGGCCGCCTGGTGGACGGCTTCCTG 

GCCCTGATCTGGGAC^ACCTGCGCTCCCTGTGCCTGTTCTCCTACCACCGCCTGCGCGACCTGCTGCTGATCGTGACCCGCA 

TCGTGGAGCroCTGGGCCGCCGCGGCTGGGAGGTGCTGAAGTACTGGTGGA^ 

GAACTCCGCCGTGTCCCTGCTGAACGCCACCGCCATCGCCGTGGCCGAGGGCACCGACCGCGTGATCGAGGTGGTGCAGCGC 
GCCTGCCGCGCC^TCCTGCACATCCCCCGCCGCATCCGCCAGGGCCTGGAGCGCXSCCCTGCTGT^ 


2003 B.anc Env

MRVKG IRKNCQHLWRWGTMLIjGMIjMI CSAAENIjWVTVYYGVPVWKEIATT^ PNPQE 
WXiENVTENFNMWKNNMVEQMHED IIS L1WDQSL1KP CVKLTPLCVTIiNCTDLIiNATNTNSTNMYRWRGE I KNC S FN J TTS I RD 
KMQKEYALFYTOjDvVPIDNNTSYRIjINCNTSVITQACPKVSFEPIPIHYCT 

IRPWSTQLLLNGSLAEEEWIRSENFTDNAKTI IVQIiNESVEINCTRPNNITrRKSIHIGPGRAFYATGEI IGDIRQAHCNI* 
SRAKWNNTLKQWTKLREQFDNKTIV^ 

NMWQEVGKAMYAP P I RGQ I RC S SN I TG LLLTRDGGNNETE I FRPGGGDMRDNWRS EL YKYKWKI EPLGVAPTKAKRRWQR 
EKRAVG IGAMFLGFIjGAAG STMGAASMTIiTVQARQLL»S GI VQQQNNIiLRAI EAQQHLLiQIjTVWG I KQLQAR VTiA VERYLRDQ 
QLI/SIWGCSGKLICTTTVPVTOASWSNKSIiDEIWl^^ 

FD ITNWLWYI KI F IMI VGGLVGIiR I VFAVTjS IVNRVRQGYSPLSFQTRIiPAPRGPDRPEG lEEEGGERDRDRSGRLVNGFIiA 

LIWDDLRSLCLFSYHRLRDLLLIVARIVELIXSRRGWEALKYWWNLLQYW^ 

CRAILHIPRRIRQGLERALL$ 

2003 B.anc Env.seq.opt 

ATGCGCGTGAAGGGCATCCGCAAGAACTGCCAGCACCTGTGGCGCTGGGGCACGATG 

CCGCCGCCGAGAACCTGTGGGTGACCGTGTACTACGGCGTGCCCGTGTGGAAGGAGGCCACCACC7VCCCTGTTCTGCGCCTC 



CGACGCCyU^GGCCTACGAGACCGAGGTGCAC^UVCGTGTGGGCCAC^ 
GTGGTGCTGGAGAACGTGACCGAGAACTTGAACATGTGGAAGAACAACATGGTGGAG 

TGTGGGACCAGTCCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTGACCCTGAACTGCACCGACCTGCTGA 
CAACACCAACTCCACCAACATGTACCGCTGGCGCGGCGAGATCAAGAACTGCTCCTTCAACATCACCACCT 
AAGATGCAGAAGGAGTACGCCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACAACAAO 
GCAACACCTCCGTGATCACCCAGGCCTGCCCCAAGGTGTCCTTC^ 

CGCCATCCTGAAGTGCAACGACAAGAAGTTCAACGGCACCGGCCCCTGCAAGAACGTGTCCACCGTGCAGTGCA 

ATCCGCCCCGTGGTGTCCACCCAGCTGCTGCTGAACGGCTCCCTGGCCGAGGAGGAGGTGGTGATCCGCTCCGAGAACTTCA 

CCGACAACGCCAAGACCATCATCGTGCAGCTGAACGAGTCCGTGGAGATCAACTGCACCCGCCCC^^ 

GTCCATCCACATCGGCCCCGGCCGCGCCTTCTACGCCACCGGCGAGATCATCGGCGACATCCGCCAGGCCCACTGCAACCTG 
TCCCGCGCCAAGTGGAACAAGACCCTGAAGCAGGTGGTGACCAAGCTGCGCGAGCAGTTCGACAACAAG 

ACCCCTCCTCCGGCGGCGACCCCGAGATCGTGATGCACTCCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACACCACCCA 
GCTGTTCAACTCCACCTGGAACGGCACCTGGAACAACACCGAGGGCAACATCACCCTGCCCTGCCGCAT 

AACATGTGGCAGGAGGTGGGCAAGGCCATGTACGCCCCCCCCATCCGCGGCCAGATCCGCTGCTCCTCCAACATCACCGGCC 
TGCTGCrTOACCCGCGACGGCGGCAACAACGAGACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCTC 
CGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACC^ 

GAGAAGCGCGCCGTGGGGATCX^CGCCATGTTCCTGGGCTTCCTGGGCGCCGCCGGCTCCACCATGGGCGCCGCCTCCATGA 
CCCTGACCGTGCAGGCCCGCCAGCTGCrreTCCGGCATCGTGCAGCAGCAGAAClAA 

GCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGCGCGACCA^ 

CAGCTGCTGGGGATCTGGGGCTGCTCCGGCLAAGCTGATCTGCAC^ 

CCCTGGACGAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGCGAGA 

GATCGAGGAGTCCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCK3CTGGAGCTGGACAAGTGGGCCTCCCTGT 

TTCGACATCACGAACTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGGTGGGCCT 

CCGTGCrTOTCCATCGTGAACCGC^TGCGCCAGGGCTACTCCCCCCTGTCC?^ 

CGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGACCGCTCCGGCCGCCTGGTGAACGGCTTCCTGGCC 

CTGATCTGGGACGACCTGCGCTCCCTGTGCCTGTTCn'CCTACCACCGCCTGCGCGACCTGCTGCTGATCGTGGCC 

TGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACTGGTGGAACCTGCTGCAGTACTGGTCCCAGGAGCTGAAGAA 

CTTCCGCCGTGTCCCTGCrrGAACGCCACCGCCATCGCCGTGGCCGAGGGCACCGACCGCGTGATCGAGGTGGTGCAGCGCGCC 

TGCCGCGCCATCCTGCACATCCCCCGCCGCATCCGCCAGGGCCTGGAGCGCGCCCTGCTGTAA 

2003 CON_C Env

MRVRG ILRNCQQWWI WG ILGFWMLMI CNVVGNI*VTVTV^^ 
IVLENVTENFNMWKNDMVDQMHEDIISLVTO^ 
FYRIjDIVPLNENNSYRLINCNTSAITQACPKVSFDPIPIH^ 
LLI^GSIiAEEEIIIRSENLTNNAKTIIVHIjNESVEIVCTRPNNNTIUC^ 

LQKVS KKLKEHFPNKTI KFEPS SGGDIiE ITTHS FNCRGEFFYCNTSKLFNSTYNSTNSTITLPCRIKQIINMWQEVGRAMYA 
PPIAGNITCKSNITGLLLTRDGGKNNTETFRPGGGDMRD^ 
GFI^SAAGSTMGAASITLTVQARQIjLSGIVQQQSNIjIjRAIEAQQH^ 
ICTTAVPWNSSWSNKSQEDIWDNMTWMQWDREISNYTDTI 

FIMIVGGLIGLRIIFAVLSIVNRVRQGYSPLSFQTLTPNPRGPDRLGRIEEEGGEQDRDRSIRLVSGFLA^ 
SYHRLiRDFILIAARAVELIjGRSSIjRGL^RGWELALKYIjGSLVQYWGL ieliqri crair 

NIPRRIRQGFEAALQ$ 

2003 CON_C Env.seq.opt

ATGCGCGTGCGCGGCATCCTGCGCAACTGCCAGCAGTGGTGGATCTC 

ACGTGGTGGGCAACCTGTGGGTGACCGTGTACTACGGCGTGCCCGTGTGGAAGGAGGCCAAGACCACCCTGTTCTGCGCCTC 
CGACGCCAAGGCCTACGAGAAGGAGGTGCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAG 
ATCGTGCTGGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACGACATGGTGGACC^G 

TGTGGGACCAGTCCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTGACCCTGAACTGCACCAACGCCACCAACGCCAC 

CAACACCATGGGCGAGATCAAGAACTGCTCCTTCAACATCACCACCGAGCTGCGCGACAAGAAGCAGAAGGTGTACGCCCTG 

TTCTACCGCCTGGACATCGTGCCCCTGAACGAGAACAACTCCTACCGCCTGATCAACTGCAACACCTCCGCCATCACCC^ 

CCTGCCCCAAGGTGTCCTTCGACCCCATCCCCATCCACTAC^^ 

GACCTTCAACGGCACCGGCCCCTGCAAGAACGTGTCCACCGTGC^ 

CTGCTGCTGAACGGCTCCCTGGCCGAGGAGGAGATCATCATCCGCT 

TGCACCTGAACGAGTCCGTGGAGATCGTGTGCACCCGCCCGAACAACAACACCCGCAAGTCCATCCGCATCGGCCCCGGCCA 
GACCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCTCCGAGGACAAGTGGAACAAGACC 
CTGCAGAAGGTGTCCAAGAAGCTGAAGGAGCACTTCCCCAACAAGACCATCA 
AGATCACCACCCACTCCTTCAACTGCCGCGGCGAGTTCT^CTACTGGAACACCTCCA 

CACCAACTCCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATCAACATGTGGCAGGAGGTGGGCCGCGCCATGTACGCC 



CCCCCCATCGCCGGCAACATCACCTGCAAGTCCAACATCA 

AGACCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCTCCGAGCTGTACAAGTACAAGGTGGTGGAGATCAAGCC 
CCTGGGCATCGCCCCCACCAAGGCCAAGCGCCGCGTGGTGGAGCGCGAGAAGCGCGCCGTGGGCATCGGCGCCGl^TTCCTG 
GGCTTCCTGGGCGCCGCCGGCTCCACCATGGGCGCCGCCTCC^T 

TCGTGCAGCAGCAGTCCAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACATGCTGCAGCTGACCGT 

GCTGCAGACCCGCGTGCTGGCCATCGAGCGCTACCTGAAGGACGAGCAGCTGCTGGGCATCTGGGGCTGCTCCGGCAAGCTG 
ATCTGCACCACCGCCGTGCCCTGGAACTCCTCCTGGTCCAACAAGTCCCAGGAGGACATCTTC 

AGTGGGACCGCGAGATCTCCAACTACACCGACACCATCTACCGCCTGCTGGAGGACTCCCAGAACCAGCAGGAGA
GAAGGACCTGCTGGCCCTGGACTCCTGGAAGAACCTGTGGAACTGGTTC 

TTCATCATGATCGTGGGCGGCCTGATCGGCCTGCGCATCATCTTCGCCGTGCTGTCCATCGTGAACCGCGTGCGCCAGGGCT 
- ACTCCCCCCTGTCCTTCCAGACCC^X^CCCCCAACCCCCGCGGCCCCGACCGCCTGGGCCGCATCGAGGAGGAGGGCGG 
GCAGGACTOCGACCGCTCCATCCGCCTGGTGTCCX^CTTCCTGGCCCTGGCCTGGGACGACCTGCGCTCCCTGTGCCTGTTC 
TCCTACCACCGCCTGCGCGACTTCATCCTGATCGCCGCCCGCK3CCGTGGAGCTGCTGGGCCGCTCCTCCCTGCGCGGCCTGC 
AGCGCGGCTGGGAGGCCCTGAAGTACCTGGGCTCCCTGGTGCAGTACTGGGGCCTGGAGCTGAA 
GCTGGACACCATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCAT 
AAflATCCCCCGCCGCATCCGCCAGGGCTTCGAGGCCGCCCTGCAGTAA 



2003 C.anc Env 

MRVMGILRNCQQWW I WG ILGFWMLM I CNWGNLWVTVYYGVP VWKEAKTTLFCAS DAKAYEREVHNVWATHACVPTDPNPQE 
MVLENVTENFNMWKNDMVDQMHEDI I SLWDQSIjKPCVKLTPIjCVTIjNCTNATNATNTMGEMKNCS FNITTEIiRDKKQKVYAIj 
FYRI^DIVPLNDNNSYRilNCNTSAITQACT 

IiLIiNGSLAEEEI I IRSENLTDNAKTI IVHLNESVE I VCTRPNNNTRKS IRIGPGQTFYATGDIIGDIRQAHCNISEEKWNKT 
LQRVGEKLKEHFPNKTIKFAPSSGGDIjEITTHSFNCRGEFFYC»^ 

pp I AGN I TCKSNI TGLLiIjTRDGGKNNTETFRPGGGDMRDNWRS ELYKYKWE I KPLG I APTEAKRRWEREKRAVG I GAVFL 
GFIXSAAGSTMGAASITLTVQARQLLSGIVQQQSNLLRAIEATO^ 

ICTTAVPWNSSWSNKSQEEIWDNMTWMQWDREISNYTDTIYRLLEDSQNQQEKNEQDL 
FIMIVGGLIGLRIIFAVTjSIViraVRQGYSPLSFQTLTPNPRGPD^^ 
SYHRIJU3FILIAARAVEIiIX3RSSLRGLQRGWEALKYIX3SL^^ 
NIPRRIRQGFEAALL$

2003 C.anc Env.seq.opt

ATGCGCGTGATGGGCATCCTGCGCAACTGCCAGCAGTGGTGGATC 

ACGTGGTGGGCAACCTGTGGGTGACCGTGTACTACGGCGTGCCCGTGTGGAAGGAGGCCAAGAC 

CGACGCCJ^AGGCCTACGAGCGCGAGGTGCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAG 
ATGGTGCIX3GAGAACGTGACCGAGAACTTCAACATGTGGAAGAACGACATGGTGGACCA 

TGTGGGACCAGTCC CTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTGACCCTGAACTGCACCAACG CCACCAACGCCAC 

CAACACGATGGGCGAGATGAAGAACTGCTCCrrTCAAGATCACCACCGAGCTGCGCGACAAGAAGCAGAAGGT^ 

TTCTACCGCCTGGACATCGTGCCCCTGAACGACAACAACTCCTACCGCCTGATCAACTC 

CCTGCCCCAAGGTGTCCTTCGACCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTACGCCATCCTGAAGTGCAACAACAA 
GACCTTC^CGGCACCGGCCCCTGCAAC^^CGTGTCC^CCGTG^^ 

CTGCTGCTGAACGGCTCCCTGGCCGAGGAGGAGATCATCATCCGCTCCGAGAACCTGACCGACAACGCCAAGACCATCATCG 

TGCACCTGAACGAGTCCGTGGAGATCGTGTGCACCCGCCCCAACAACAACACCCGCAAGTCCATCCGCATCGGCCCCGGCCA 

GACCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATCTCCGAGGAGAAGTGGAACAAGACC 

CTGCAGCGCGTGGGCGAGAAGCTGAAGGAGCACTTCCCCAACAAGACCATCAAGTTCGCCCCCTCCTCCGGCGGCGACC 

AGATCACC AC C CACTCCTTCAACTG C CG CGGCGAGTTCTTCTAC TGCAACAC CTCCCGCCTGTTCAACTCCACCTACAACTC 

CAAGAACTCCACCATCACCCTGCCCTGCCGCATCAAGCAGATCATCAACATGTGGC^ 

CCCCCCATCGCCGGCAACATCACCTGCAAGTCCAACAT 

AGACCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCTCCGAGCTGTACAAGTACAAGGTGGTGGAGATCAAGCC 

CCTX3GGCATCGCCCCCAGCGAGGCCAAGCGCCX5CGTGGTGGAGCGCGAGAAGCGCGCCGTGGGCATCGGCGCCGTGTTCCT 

GGCTTCCTGGGCGCCGCCGGCTCCACCATGGGCGCCGCCTCCATCACCCTGACCGTGCAGGCCCGCCAGCTGCTGTCCGGCA 

TCGTGCAGCAGCAGTCCAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACATGCTGCAGCTGACCGTGTGGGGCATCAAG 

GCTGCAGACCCGCGTGCTGGCCATCGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCTCCGGCAAGCTG 

ATCTGCAXTCACCGCCGTGCCCIXKSAACTCCT^ 

AGTGGGACCGCGAGATCTCCAACTACACCGACACCATCTACCGCCTGCTGGAGGACTCCCAGAACCAGCAGGAGAAGAACGA 

GCAGGACCTGCTGGCCCTGGACTCCTGGGAGAACCTGTGGAACTGGTTCGACATCACCAACTGGCTC 

TTCATCATGATCGTGGGCGGCCTGATCGGCCTGCGCATCATCTTCGCCGTGCTGTCCATCGTGAACCGCGTGCGCCAGG 

ACTCCCCCCTGTCCTTCCAGACCCTGACCCCCAACCCCCGCGGCCCCGACCGCCTGGGCCGCATCGAGGAGGAGGGCGGCGA 

GCAGGACCGCGACCGCTCCATCCGCCTGGTGTCCGGCTTCCTGGCCCTGGCCTGGGACGACCTGCGCTCCCTGTGCCTGTTC 

TCCTACCACCGCCTGCGCGACTTCATCCTGATCGCCGCCCGCGCCGTGGAGCTGCTGGGCCGCTCCTCCCTGCX5CGGCCTGC 
AGCGCGGCTGGGAGGCCCTGAAGTACCTGGGCTCCCTGGTGCAGTACTGGGGCCTGGAGCTGAAGAAGTCCGCCATCTCCCT
GCTGGACACCATCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGAGCTGATCCAGCGCATCTGCCGCGCCATCCGC 
AACATCCCCCGCCGCATCCGCCAGGGCTTCGAGGCCGCCCTGCTGTAA 



2003 CON_D Env

MRVRG IQRNYQHIiWRWGI MLLGMLM I CS VAENLWVTVYYGVPVWKEIATTTIjFCASDAKS YKTEAHN I WATHACVPTDPNPQE
IELENVTENFNMWKNNMVEQMHEDIISLWDQSLK^ 

* VHALFYKLDVVPIDDNNSNTSYRLINCNTSAITQACPKVTFEPIPI 
RPWSTQL.LLNGSLAEEEI I IRSENLTNNAKI 1 I VQLNES VT INCTRPYNNTRQRTP I GPGQALYTTRI KGD I RQAHCN I SR 
AEWNKTLQQVAKKIiGDLIjNKTTI IFKP 

- WQGVGKAMYAPPIEGLIKCSSNITGIjLIjTRDGGANNSHNETFRPGGGDMRDNV^ 

ekraiglgamflgflgaagstmgaasmti/ivqarqll^ 
qlixsiwgcsgkhictttvpwnsswsnksi^eiwnnmtwmewereidnytgl 

FS ITQWLWYIKI FIM IVGGL IGLRIVFAVIjSLVNRVRQGYSPLiSFQTIjIjPAPRGPDRPEGIEEEGGEQGRGRS irlvngfsa 

liwddijo^clfsyhrlrdliliaarivei^ 

CRAIIiNI PTR IRQGLERAIili $ 
2003 CON_D Env.seq.opt

ATGCGCGTGCGCGGCATCCAGCGCAACTACCAGC^CCTGTGGCGCTGGGGCATCATC 

CCX3TGGCCGAGAACCTGTGGGTGACCGTGTACTACGGCGTGCCCGTGTGGAA 

CGACGCCAAGTCCTAC^^GACCGAGGCCCACAACATCTGGGCCACCCACGCCT 

ATCGAGCTGGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAGGACATCATCTCCC
TGTGGGACCAGTCCCTGAAGCCCTGCGTGAAGCTGACCCCCCTG^ 

GACCTCCAACGACACGAACGAGGGCGAGATGAAGAACTGCTCCTTCAACATCACCACCGAGAT 
GTGCACGCCCTGTTCTACAAGCTGGACGTGGTGCCCATCGACGACAACT^CTCCAACACCTCCTAC^ 
ACACCTCCGCCATC^CCC^GGCCTGCCCCAAGGTGACCTTCGAGCC^ 
CATCCTGAAGTGCAAGGAGAAGAAGTTCAACGGCACC 

CGCCCCGTGGTGTCCACCCAGCTGCTGCTGAACGGCTCCCTGGCCGAGGAGGAGATCATCATCCGCTCCGAGAACCTGACCA 

ACAACGCGAAGATCATCATCGTGCAGCTGAACGAGTCCGTGACCATCAACTGCACCCGCCCCT 

C^CCCCGATCGGCCCCGGCCAGGCCCTGTACACC^CCCGCATCAAGGGC^ 

GCCGAGTGGAACAAGACCCTGCAGCAGGTGGCCAAGAAGCTGGGCGACCTGCrTGAACAAGACCACC^ 
CCTCCGGCGGCGACCCCGAGATC^CCACCGACTCCTTCAACTGCGGCGGC 

CAACTCCACCTGGAACAACACCAAGTGGAACTCCACCGGCAAGATCACCCTGCCCTGCCGCATCAAGCAGATCATCAACATG 
TGGCAGGGCGTGGGCAAGGCCATGTACGCCCCCCCCATCGAGGGCCTGATCAAGTGCTCCTCCAACATCACCGGCCTGCTGC 
TGACCCGCGACGGCGGCGCCAACAACTCCCACAACGAGACCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCTC 
CGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCCGCGCCAAGCGCCGCGTGGTGGAGCGC 
GAGT^GCGCGCCATCGGCCTGGGCGCCATGTTCCTGGGCT^CCTGGGCGCCGCCGGCTCCACCATGGGCGCCGCCTCCATGA 
CCCTGACCGTGCAGGCCCGCCAGCTGCTGTCCGGCATCGTGCAGCAGCAGAACAACCTGCTK3 

GCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCATCCTGGCCGTGGAGCGCTACCTGAAGGACCAG 
CAGCTGCTGGGCATCTGGGGCTGCTCCGGCAAGCACATCTGCACCACCACCGTGCC 

CCCTGGACGAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGACAACTACACCGGCCTGATCTACTCCCT 
GATCGAGGAGTCCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGGAGCTGGACAAGTGGGCCTCCCTGTGGAACTGG 
TTCTCCATCACCCAGTGGCTGTGGTACATCAAGATCTTGATGATGATCGTGGGCGGCCTGATCGGCC^^ 

CCGTGCTGTCCCTGGTGAACCGCGTGCGCCAGGGCTACTCCCCCCTGTCCTTCCAGACCCTGCTGCCCGCCCCCCGCGGCCC 
CGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCAGGGCCGCGGCCGCTCCATCCGCCTGGTGAACGGCTTCTCCGCC 
CTGATCTGGGACGACCroCGCAACCTGTGCCTGTTCTCCTACCACCGCCTO 

TGGAGCTGCTGGGCCGCCGCGGCTGGGAGGCCCTGAAGTACCTGTGGAACCTGCTGCAGTACTGGATCCAGGAGCTGAAGAA 
CTCCGCCATCTCCCTGTTCGACACCACCGCCATCGCCGTGfGCCGAGGGCACCGACCGCGTGATCGAGATCGTGCAGCGCGCC 
TGCCGCGCCATCCTGAACATCCCCACCCGCATCCGCCAGGGCCTGGAGCGCGCCCTGCTGTAA



2003 CON_F1 Env

MRVRGMQRNVJQHLGKWGLIjFIjG I L 1 1 CNAAENIj WVTVYYGVP VWKEATTTL FCASDAKS Y^KEVHNVWATHACVPTD PNPQE 
WLiENVTENFDMWKNNMVEQMHTDI I SLWDQSLKPCVKLTPLCVTTjNCTDVNATNNDTNDNKTGAI QNCS FNMTTEVRDKKLi 
KVHAIiFYKLDIVPISNNNSKYRLINO^STITQACPKVSWDPIPIHYCAPAGYAILKCNDKR 

PWSTQLLLNGSLAEEDI I IRSQNI SDNAKTI IVHLNESVQI NCTRPNNNTRKS I HIX3 PGQAFYATGE 1 1 GD I RKAHCN I SG 
TQWNKTLEQVKAKLKSHFPNKTIKFNSSSGGDLEITMHSFNCRGEFFYCNTSGIjFNDTGSNG 
AMYAAPIAGNITCNSNITGIibljTRDGGQNNTETFRPGGGNMKDNWRSELYKYKVVEIEPIXSV 
AVFIX3FLGAAGSTMGAAS ITLTVQARQLLSGI VQQQNNLLRAIEAQQI^ 

SGKXiICTTNVPWNSSWSNKSQDEIWNNMTWMEWEKEISNYSNI IYRL»IEESQNQQEKNEQELLA1»DKWAS LWNWFDISNWLW 
Y I KI F IMI VGGL IGLRI VFAVLS I VNRVRKGYS PL SLQTLI PS PREPDRPEG I EEGGGEQGKDRSVRL.VNGFLALVWDDLRN

LCLFSYRHLRDFILIAARIVDRGLRRGWEAIiKYIjGNIjTQYWSQELKNSAISLLOT 

RRIRQGLiERAIjL$ 

2003 CON_F1 Env.seq.opt

ATGCGCGTGCGCGGCATGCAGCGCAACTGGCAGCACCTGGGCAAGTGGGGCCTGCTGTTCCTGGGC^ 

ACGCCGCCGAGAACCTGTGGGTGACCGTGTACTACGGCGTGCCCGTGTGGAAGGAGGCCACCACCACCCTGTTCTGCGCCTC 

CGACGCCAAGTCCTACGAGAAGGAGGTGCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAG
GTGGTGCTGGAGAACGTGACCGAGAACTTCGAGATGTGGAAGAACAACATGGT^ 

TGTGGGACCAGTCCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTGACCCTGAACTGCACCGACGTGAACGCCACCAA 

CAACGACACCAACGACAACAAGACCGGCGCCATCCAGAACTGCTCCTTCAACATGACCACCGAGGT
AAGGTGCACGCCCTGTTCTACAAGCTGGACATCGTGCCCATCTCCAACAACAACTCCAAGTACCG 
CCTCCACCATCACCCAGGCCTGCCCCAAGGTGTCCTGGGACC^ 

CCTGAAGTGCAACGACAAGCGCTTCAACGGCACCGGCCCCTGCAAGAACGTGTCC^ 

CCCGTGGTGTCCACCCAGCTGCTGCTGAACGGCTCCCTGGCCGAGGAGGACATCATCATCCGCTCCCAGAACATCTCCGACA
ACGCCAAGACCATCATCGTGCACCTGAACGAGTCCGTGCAGATGAACTGCAC 

CCACCTGGGCCCCGGCCAGGCCTTCTACGCCACCGGCGAGATCATCGGCGACATCCGCAAGGCCCACTGCAACATCTCCGGC 

ACCCAGTGGAACAAGACCCTGGAGCAGGTGAAGGCCAAGCTGAAGTCCClACrrTCCCCAA 

CCTCCGGCGGCX1ACCTGGAGATCACGATGCACTCCTTCAACTGCCGCGGCGAGTTCTTCTACTGCAACAC 

CAACGACACCGGCTCCAACGGCACCATCACCCTGCCCTGCCGCATCAAGCAGATCGTGAACATGTGGCAGGAGGTGGGCCGC 

GCCATGTACGCCGCCCCCLATCGCCGGCAAC^TCACCTGCAACTCCAACATCACCGGCCTC 

AGAACAACACCGAGACCTTCCGCCCCGGCGGCGGCAACATGAAGGACAACTGGCGCTCCGAGCTGTACAAGTACAAGGTGGT 

GGAGATCGAGCCCCTGGGCGTGGCCCCCACCAAGGCCAAGCGCCAGGTGGTGAAGCGCGAGCGCCGCGCCGTGGGCATCGGC 

GCCGTGTTCCTGGGCTTCCTGGGCGCCGCCGGCTCCACCATGGGCGCCGCCTCCATCACCCTGACCGTGCAGGCCCGCCAGC 

TGCTGTCCGGCATCGTGCAGCAGCAGAACAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGA 

GGG CATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTA 

TCCGGCAAGCTGATCTGCACCACCAACGTGCCCTGGAACTCCTCCTGGTCCAACAAGTCCCAGGACGAGATCTGGAACAACA 
TGACCTGGATGGAGTGGGAGAAGGAGATCTrCCAACTACTCCAACATCATCTACCGCCTGATCGAGGAGTCCCAGAACCAGCA 
GGAGAAGAACGAGCAGGAGCTGCTGGCCCTGGACAAGTGGGCCTCCCTGTGGAACTGGTTCGACAT 
TACATCAAGATCTTCATCATGATCGTGGGCGGCCTGATCGGC^ 

TGCGCAAGGGCTACTCCCCCCTGTCCCTGCAGACCCTGATCCCCTCCCCCCGCGAGCCCGACCGCCCCGAGGGCATCGAGGA 
GGGCGGCGGCGAGCAGGGCAAGGACCGCTCCGTGCGCCTGGTGAACGGCTTCCTGGCCCTGGTGTGGGACGACCTGCGCAAC 
CTGTGCCTGTTCTCCTACCGCC^CCTGCGCX^CTTCATCCTGATCGCCGCCCGCAT^ 

GGGAGGCCCTGAAGTACCTGGGCAACCTGACCCAGTACTGGTCCCAGGAGCTGAAGAACTCCGCCATCTCCCTGCTGAACAC 
CACCGCGATCGTGGTGGCCGAGGGCACCGACCGCGTGATCGAGGCCCTGCAGCGCGCCGGCCGCGCCGTGCTGAACATCCCC 
CGCCGCATCCGCCAGGGCCTGGAGCGCGCCCTGCTGTAA 

2003 CON_F2 Env

MRVREMQRNWQHLfGKWGIiLFLG ILI I CNAADNIiVfVTTVYYGVPVWKEATTTLFCASDAKAYEREVHNVWATYACV^ PQE
LVLGNVTENFNMWKNNIWDQMH^ 

EYALFYRIiDWPINNSIVYRIilSCNTSTVTQACPKVSFEPIPI 

VSTQLLLNGSLAEEDI I IRSENISDNTKTI I VQFNRSVEINCTRPNNNTRKS IRIGPGRAFYATGDIIGDIRKAYCNINRTL 

WNETLKKVAEEFKNHFNITVTFNPSSGGDLEITTHSFNCRGEFFYCiro 

MYAPPIAGQIQCNSNITGLLLTRDGGKNGSETIjRPGGGDMRDNWRSEIiYKYTCVVKIEPL^ 

VLLGFLGAAGSTMGAASITLTVQARQLLSGIVQQQSNLLKAIE^ 

GKLICTTNVPWNSSWSNKSQDEIVTONMTWMQW 

IKIFIMIVGGLIGLRIVFAVT,SVVNRVRQGYSPIjSLQTIjIPNPRGPERPGGIEEEGGE 
CLFSYRHIiRDFILIAARTVDMGLKRGVfEAIjKYIjWNIjPQYWGQEIjKNS 
R I RQGFE RAIjLi $ 

2003 CON_F2 Env.seq.opt

ATGCGCGTGCGCGAGATGCAGCGCAACTGGCAGCACCTGGGCAAGTGGGGCCTGCTGTTCCTGGGCATCCTC 
ACGCCGCCGACAACCTGTGGGTGACCGTGTACTACGGCGTGCCCGTGTGGAAGGAGGCCACCACCACCCT^TTCTGCGCCTC 
CGACGCCAAGGCCTACGAGCGCGAGGTGCACAACGTGTGGGCCACCTACGCCTGCGTGCCCACCGACCCCTCCCCCCAGGAG 
CTGGTGCTGGGCAACGTGACCGAGAACTTCAACATGTGGAAGAACAAC 

T^TGGGACCAGTCCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTGACCCTGAACTGCACCGACGTGAACGTGACO 
CAACACCACCAACGTGACCCTGGGCGAGATCAAGAACTGCTCCTTCAACATCACCACCGAGATCAAGGACAAGAAGAAGAAG 
GAGTACGCCCTGTTCTACCGCCTGGACGTGGTGCCCATCAACAACTrCCATC^ 
CCX3TGACCCAGGCCTGCCCCAAGGTGTCCTTCGAGCCCATC 
GTGCAACGACAAGAAGTTCAACGGCACCGGCCTO
GTGTCCACCCAGCTGCTGCTGAACGGCTCCCTGGCCGAGGA 

AGACCATCATCGTGCAGTTCAACCGCTCCGTGGAGATCAACTGCT^CCCGCCCCAACAACAACACCCGCAAGTCCA 

CGGCCCCGGCCGCGCCTTCTACGCCACCGGCGAC^TGATCGGCGACATC 

TGGAACGAGACCCTGAAGAAGGTGGCCGAGGAGTTCAAGAACCACTTCAACATCACCGTC 

GCGACCTGGAGATCACCACCCACTCCTTCAACTGCCGCGGCGAGTTCTTCTACTGCAACACCTCCGACCTGTTQ 
CGAGGTGAACAACACCAAGACCATCACCCT'GCCCTGCCGCATCCGCCAGTTCGTGAACATGTGGCAGCGCGTGGGCCGCGCC 
ATGTACGCCCCCCCCATCGCCXK3CCAGATCCAGTGCAACTCCAACATCACCGGCCTGCTGCTGACCCGCGACGGCGGCAAGA 
ACGGCTCCGAGACCCIXX:GCCCCGGCGGCX3GCX3ACATGCGCGACAACT^GCGCTC 

GATCGAGCCCCTGGGCGTGGCCCCC^CGAAGGCCAAGCGCCAGGTGGTGCAGCGCGAGAAGCGCGCCGTGGGCATCGGCGCC 
GTGCTGCTGGGCTTCCTGGGCGCCGCCGGCTCCACCATGGGCGCCGCCTCCATCACCCTGAC^ 

TGTCCGGCATCGTGCAGCAGCAGTCCAACCTGCTGAAGGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGG 

CATCAAGCAGCTGCAGGCCCGCATCCTGGCCGTGGAGCGCTACCrrGAAGGACCAGCAGCTGCl^ 

GGCAAGCTGATCTGCACCACCAACGTGCCCTGGAACTCCTCCTGGTCCAACAA 

CCTGGATGCAGTGGGAGAAGGAGATCTCCAACTACACCGACACCATCrrACCGCCTGATCGAGGACGCCCAGAACCAGC^ 

GAAGAACGAGCAGGACCTGCTGGCCCTGGACAAGTGGGACAACCT 

ATCAAGATCTTCATCATGATCGTGGGCGGCCTGATCGGCCTGCGCAT 

GCCAGGGCTACTCCCCCCTGTCCCTGCAOACCCTGATCCCCAACCCCCGCGGCCCCGAGCGCCCCGGCGGCATCGAGGAC^ 

GGGCGGCGAGCAGGACCGCGACCGCTCCATCCGCCTGGTGTCCGGCTTCCTGGCCCTGGCCTGGGACGACCTGCGCTCCCTG 

TGCCTGTTCTCCTACCGCCACCTGCGCGACTTCATCCTGATCGCCGCCCGCACCGTGGACATGGGCCTGAAGCGCGGCTGGG 

AGGCCCTGAAGTACCTGTGGAACCTGCCCCAGTACTGGGGCCAGGAGCTGAAGAAGTCCGCCATCTCCCTGCTC 

CGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGCTGCAGCGCGCCGGCCGCGCCGTGCTGCACATCCCCCGC 

CGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAA 


2003 CON_G Env

MRVKG I QRNWQHLWKWGTL I LGLVI I CS ASNNLWVTVYYGVP^ 

I TLENVTENFNMWKNNMVEQMHED IIS LVTOESLKPCVKLTPLCVTI^CTDVNVTNNNTNNTKKE IKNCS FNITTE I RDKKKK 
EYALFYRIaDWPINDNGNSSIYRLINCNVSTIKQACPKV^ 

KPWSTQLLIiNGSLABEEI I IRSENITDNTKVI IVQLNETIEINCTRPNNNTRKS I R I GPG QAF YATGD 1 1 GD I RQAHCNV S 

RTKWNEMIjQKVKAQLKKIFNKSITFNSSSGGDL^ 

VGQAMYAPPIAGNITCRSNITGLIjIjTRDGGNNNTETFRPGG^^ 

GLGAVLLGFLGAAGSTMGAASITLTVQVRQLLSGIVQQQSNLIjRA^ 

WGCSGKLICTTNVTVmTSWSNKSYNEIWDNMTWIEWEra 

WLWYIKIFIMIVGGLIGLRIWAVLSIVNRVRQGYSPLSFQTLTm 

IJ*SLCLFSYHRIJU5FIIiIAARTVEU,GRSSLKGLRI^^ 

RACRAILNIPRRIRQGLERAIjIj$ 

2003 CON_G Env.seq.opt

ATGCGCGTGAAGGG<^TCCAGCGOVACTGGCAGCACCTGTGGAAGTG^ 
CGACGCCAAGGCCTACTCCACCGAGCGCC^C^^CGTGTGGGCCACC 

ATCACCCTGGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAGGACATCATCTCCC

TGTGGGACGAGTCCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTGACCCTGAACTGCACCGACGTGAACG 

CAACAACACGAACAACACCAAGAAGGAGATCAAGAACTGCTCCTTCAACAT 

GAGTACGCCCTGTTCTACCGCeTOGACGTGGTGCCCATCAACGACAACGGCAACTCCTCC^ 

AOTTGTCCACCATCy^GC^GGCCTGCCCCAAGGTGACCTTCGAC^ 

CATCCTGAAGTGCCGCGACAAGAAGTTCAACGGCACCGGCCCCTGCAAGAACGTGTCGACCGTG 

AAGCCCGTGGTGTCCACCCAGCTGCTGCTGAACGGCTCCCTGGCCGAGGAGGAGATCATCATCCGC^CCGAGAACATCACCG 

ACAACACCAAGGTGATCATCGTGCAGCTGAACGAGACCATCGAGATCAACTGCACC 

CATCCGC7VTCGGCCCCGGCCAGGCCTTCTACGCCACC 

CGCACCAAGTGGAACGAGATGCTGCAGAAGGTGAAGGCCCAGCTGAAGAAGATCTC 

CCTCCGGCGGCGACCTGGAGATCACCACCCACTCCTTCAACTGCCGCGGCGAGTTCTTCTACTGC^CACCTCCGGCCTGTT 
CAACAACTCCCTGCTGAACTCCACCAACTCCACCATCACCCTGCCCTGCAAGATCAAGCAGATCGTGCGCATGTGGCAGCGC 
GTGGGCCAGGCCATGTACGCCCCCCCGATCGCCGGCAACATCACCTGCCGCTCO\ACA 

ACXK3CGGCAACAACAACACCGAGACCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCTCCGAGCTGTACAAGTA 
CAAGATCGTGAAGATCAAGCCCCTGGGCGTGGCCCCCACCCGCGCCCGCCGCCGCGTGGTGGAGCGCGAGAAGCGCGCCGTG 
GGCCTGGGCGCCGTGCTGCTGGGCTTCCTGGGCGCCGCCG^ 

TGCGCCAGCTGCTGTCCGGCATCGTGGAGCAGCAGTCCAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCT 
GACCGTGTGGGGCATCAAGGAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAG 
TGGGGCTGCTCCGGCAAGCTGATCTGCACCACCAACGTGC

GGGACAACATGACCTGGATCGAGTGGGAGCGCGAGATCTCCAACTACACCCAGCAGATCTACTCCCTGATCGAGGAGTCCCA 
GAACCAGCAGGAGAAGAACGAG CAGGACCTGCTGGCCCTGGACAAGTGGGCCTCCCTGTG^AACnX3GTTCGACATCAC CAAG 
TGGCTX3TGGTACATCAAGATCTTGATCATGAT 

TGAACCGCG1X3CGCCAGGGCTACTCCCCCCTGTCCTTCGAGACCCTGACCCACCACCAGCGCGAGCCCGACCGCCCCGAGCG 
CATCGAGGAGGGCGGCGGCGAGCAGGACAAGGACCGCTCCATCCG 

CTGCGCTCCCTGTGCCTGTTCTCCTACCACCGCCTGCGCGACTTCATCCTGATCGCCGCCCGCACCGTGGAGCTGCTGGGCC 
GCTCCTCCCTOAAGGGCCTGCGCCTGGGCTGGGAGGGCCTGAAGTACCTGTGGAACCTGCTGCTGTACTGGG 
GAAGAACTCCGCCATCAACCTGCTGGACACCATCGCCATCGCCGTGGCCIAACTGGACCGACCGCGTGATCGAGGTGGCCCAG 
CGCGCCTGCCX3CGCC^TCCTGAACATCCCCCGCCGC^TCCGCC^GGGCCTGGAGCGCX3CCCTC 



2003 CON_H Env

TRVMETQRNYPSLWRWGTI,IIX3MIiDICSAAGNIiWV^



MVLENVTTENFNMWENDMVEQMHTM 
QKVHALFYRLDVVPIDDNNSYQYRLiINCNTSVITQACPKVS FEPIPIHYCAPAGFAILiKCNNKTFNGTGPCTNVSTVQCTHG 
IRPVVSTQLLLNGSLAEEQVIIRSKNISDNTKNIIVQLNKPVEITCTRPNNNTR 
SGKKWNKTLHQVVTQIjGKYFDNRTIIFKPHSGGDMEVTTHSFNCRGEFFY 
N^QRVGQAMYAPPIKGNITCVSNITGLILTFDEGNNTVTFRPGGGDM^ 

KRAVGMGAFFL1GFL1GAAGS TMGAAS ITLTVQARQIjIiSG I VQQQ SNXjLRAI QAQQHMLQLTVWG I KQIjQARVIjAVB RYLKDQQ 
IiLGIWGCSGKIilCTTNVPWNSSWSNKSIJDEIWDNMTWMEWDKQINNyTEEIYRL 

SITNWIjWYIKIFIMIVGGLIGIjRIIFAVIjSIVNRVRQGYSPLSFQTLIPNPRGPDRPEGIEEEGGEQDRDRSVRLVNGFIjPL 

VWDDLRSLCLFSYRIiliRDLLLIVVRTVELLGRRGREAL 

RAILHIPRRIRQGFERTLI>$ 

2003 CON_H Env.seq.opt 



ACCCGCGTGATGGAGACCCAGCGC^^CTACCCCTCCCTGTGGCGCTC 

CCGCCGCCGGCAACCTGTGGGTGACCGTGTACTACGGCGTGCCCGTGTGGAAGGAGGCCAAGACCACCCTGTTCTGCGCCTC 
CGACGCCAAGGCCTACGAGACCGAGAAGCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAG 
ATGGTGCTGGAGAACGTGACCGAGAACTTCAACATGTGGGAGAACGACATGGT^ 

TGTGGGACCAGTCCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTGACCCTGGACTGCTCCAACGTGAACACCACCAA 

CGCCACCAACTCCCGCTTCAACATGCAGGAGGAGCTGACCAACTGCTCCTTCAACGTGACCACCGTGATCCGCGACAA^ 

CAGAAGGTGCACGCCCTGTTCTACCGCCTGGACGTGGTGCCCATCGACGACAACAACTCC^ACCAGTACCGCCTGATCAAC^ 

GCAACACCTCCGTGATCACCCAGGCCTGCCCCAAGGTGTCCTTCGAGCCGATCCCCATCCACTACTGCGCCCCCGCC 

CGCCATCCTGAAGTGCAACAAXZAAGACCTTCAACGGCACC 

ATCCGCCCCGTGGTGTCCACCCAGCTGCTGCTGAACGGCTCCCTGGCCGAGGAGCAGGTGATGATCCGCTrCCAAGAACATCT 

CCGACAACACGAAGAACATCATCGTGCAGCTGAACAAGCCCGTGGAGATCACCTGCACCCGCCCCAACAACAACACC 

GTCCATCCACCTGGGCCCCGGCCAGGCCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACATC 

TCCGGCAAGAAGTGGAACAAGACCCTGCACCAGGTGGTGACCCAGCTGGGCAAGTACTTCGACAACCGCACCA 

AGCCCCACTCCGGCGGCGACATGGAGGTGACCACCCACTCCTTCAACTGCCGCGGCGAGTTCTTC 

CCTGTTCAACTCCTCCTGGACCAACTCCACCAACGACACCAAGAACATCATCACCCTC 

AACATGTGGCAGCGCGTGGGCCAGGCCATGTACGCCCCCCCCATCAAGGGCAACATCACCTGCGTGTCCAACATCACCGGCC 
TGATCCTGACCTTCGACGAGGGCAACAACACCGTGACCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCTCCGA 
GCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCGAGGCCCGCCGCCGCGTGGTGGAGCGCGAG 
AAGCGCGCCGTGGGCATGGGCGCCTTCTTCCTGGGCTTCCTGGGCGCCGCCGGCTCCACCATGGGCGCCGCCTCCATCACCC 
TGACCGTGCAGGCCCGCCAGCTGCTGTCCGGCATCGTGCAGCAGCAGTCCAACCTGCTC 

CATGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACC 
CTGCTGGGCATCTGGGGCTGCTCCGGCAAGCTGATCTGCACC^ 

TGGACGAGATCTGGGACAACATGACCTGGATGGAGTGGGACAAGCAGATCAACAACTACACCGAGGAGATCTACCGCCTGCT 
GGAGGTGTCCCAGACCCAGCAGGAGAAGAACGAGCAGGACCTGCTGGCCCTGGACAAGTGGGCCTCC 

TCCATGACCAACTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGATCGGCCTGCGCATCATCTTCGCCG 
TGCTGTCC^TCGTGAACCGCGTGCGCCAGGGCTACTCCCCCCTGTCCTC^ 

CCGCCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCAGGACCGCGACCGCTCCGTGCGCCTGGTGAACGGCTTCCTGCCCCTG 
GTGTGGGACGACCTGCGCTCCCrraTGCCTGTTCTCCTACCGCCTGCTGCGCGACCTGCTO 

AGCTGCTGGGCCGCCGCGGCCGCGAGGCCCTGAAGTACCTGTGGAACCTGCTGCAGTACTGGGGCCAGGAGCTGAAGAACTC 
CGCCATCAACCTGCTGAACACCACCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGAGATCGTGCAGCGCGCCTGG
CGCGCCATCCTCCACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCACCCTGCTGTAA 




2003 CON H Env 





2003 CON_01_AE Env
MRVIOBTQMNWPNIjWKWGTLILGL^

IHLENVTENFNMWraTOMVEQMQED^ 

ELRDKKQKVHALFYKIjDIVQIEDNNSYRI.INCNTSV^ 

CTHGI KPVVSTQLiLiLiNGSLtAEEEI I IRSENI/TNNAKTI IVHLNKSVEINCTRPSNNTRTS ITIGPGQVFYRTGDI IGDIRKA 

YCEINGTKWNEVLKQVTEKLKEHFNNKTIIFQPPSGGDLEITMHHra 

KIKQIINMWQGAGQAMYAPPISGRINC^SNITGILI/IM 

KRRWEREKRAVG IGAMI FGFLGAAG STMGAAS ITLTVQARQLIjSG I VQQQSNIjIiRAI EAQQHLiLQLTVWG I KQLQARVI*AV 
" ERYLKDQKFIX3LWGCSGKIICTTAVPWNSTWS3TOSFEEIWNNMTWIEWEREISNYTO 
WASLWNWFDITNWLUT¥IKIFIMIVGGIiIGIjRIIFAVIjSIVNRVRQG 
LVSGFLAIiAWDDLRSLCLFSYHRLRDFIIilAARTVEL^ 
- AGWTDRVI EVAQGAWRAI L»H I PRRIRQGLiERAIjIj $ 

2003 CON_01_AE Env.seq.opt

ATGCGCGTGAAGGAGACCCAGATGAACTGGCCCAACCTGTGGAAGTG^ 

CCGCCTCCGACAACCTGTGGGTGACCGTGTACTACGGCGTGCCCGTGTGGCGCGACGCCGACACC^ 

CGACGCCAAGGCCCACGAGACCGAGGTGCACAACGTGTGGGCCACCCACGCCTGCGTO 

ATCCACCTGGAGAACGTGACCGAGAACTTCAACATGTGGAA 

TGTGGGACGAGTCCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTGACC 

CGTGAACAACATCACCAACGTGTCCAACATCATCGGCAACATCACCAACGAGGTGCGCAACTG 

GAGCTGCGCGACyVAGAAGCAGAAGGTGCACGCCCTGTTCTACy^GCTG 

GCCTGATCAACTGCAACACCTCCGTGATCAAGCAGGCCTGCCCCAAGATCTCCTTCGACCCCATCCCCATCCACTACTGCAC 
CCCCGCCGGCTACGCCATCCTGAAGTGCAACGACAAGAACTTCAACGGCACCGGCCCCTGCAAGAACGTGTCCTCCGTGC^ 
TGCACCCACGGCATCAAGCCCGTGGTGTCCACCCAGCTGCTGCTGAACGGCTCCCTGGCCGAGGAGGAGATCATCATCCGCT 
CCGAGAACCTGACCAACAACGCCAAGACCATCATCGTGCACCTGAAGAAGTC 

CAACACCCGCTVCCTCCATCACCATCGGCCCCGGCCAGGTGTTCTACCGCACCGGCGAGATCATCG^ 

TACTGCGAGATCAACGGCACCAAGTGGAACGAGGTGCTGAAGCAGGTGACCGAGAAGCTGAAGGAGCACTT 

CCATCATCTTCCAGCCCCCCTCCGGCGGCGACCTGGAGATCACCATGCACCACTTCAACTGCCGCGGCGAGTTCTTCTA 

CAACACGACCAAGCTGTTCAACAACACCTGCATCGGCAACGAGACCATGGA 

AAGATCAAGGAGATCATCAACATGTGGCAGGGCGCCGGCCAGGCCATGTACGC 

TGTCCAACATCACCGGCATCCTGCTGACCCGCGACGGCGGCGCCAACAACACCAACGAGACCTTC 

CATCAAGGACAACTGGGGCTCCGAGCTGTACAAGTACAAGGTGGTGCAGATCGAGCCCCTGGGCATCGCCCCCACCCGCGCC 
AAGCGCCGCGTGGTGGAGCGCGAGAAGCGCGCCGTGGGCATCGGCGCCATGATCTTCGGCTTCCTGGGCGCCGCCGGCTCCA 
CCATGGGCGCCGCCTCCATCACCCTGACCGTGCAGGCCCGCCAGCTGCTGTCCGGCATCGTGCAGCAGCAGTCCAACCTGCT 
GCGCGCCATCGAGGCCCAGCAGCACCTGCTGCAGCTGACCGTGTGGGGCATCAAGCAG 

GAGCGCTACCTGAAGGACCAGAAGTTCCTGGGCCTGTGGGGCTGCTCCGGCAAGATCATCTGCACCACCGCCGTGCCCTGGA 
ACTCCACCTGGTCCAACCGCTCCTTCGAGGAGATCTGGAACAACATGACCTGGATCGAGTGGGAGCGCGAGATCTCCAACTA 
CACCAACCAGATCTACGAGATCCTGACCGAGTCCCAGAACCAGCAGGACCGCAACGAGAAGGACCTGCTGGAGCTGGACAAG 
TGGGCCTCCCTGTGGAACTGGTTCGACATCACCAACTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGA 
TCGGCCTGCGCATCATCTTCGCCGTGCTGTCCATCGTGAACCGCGTGCGCCAGGGCTACTCCCCCCTGTCCTTCCAGACCCC 
CACCCACCACCAGCGCGAGCCCGACCGCCCCGAGCGCATCGAGGAGGGCGGCGGCGAGCAGGGCCGCGACCGCTCCGTGCGC 
CTGGTGTCCGGCTTCCTGGCCCTGGCCTGGGACGACCTGCGCTCCCTGTGCCTGTTCTCCTACCACCGCCTGCGCGACTTCA 
TCCTGATCGCCGCCCGCACCGTGGAGCTGCTGGGCCACTCCTCCCTGAAGGGCCTGCGCCGCGGCTGGGAGGGCCTGAAGTA 
CCTGGGCAACCTGCTGCTGTACTGGGGCCAGGAGCTGAAGATCTCCGCCATCTCCCTGCTGGACGCCACCGCCATCGCCGTG 
GCCGGCTGGACCGACCGCGTGATCGAGGTGGCCCAGGGCGCCTGGCGCGCCATCCTGCACATCCCCCGCCGCATCCGCCAGG 
GCCTGGAGCGCGCCCTGCTGTAA 

2003 CON_02_AG Env

MRVMGIQKNYPLLWRWGMI IFWIMI ICNAENLWVTVYYGVPVWRDAETTLFCASDAKAYDTEVHNWAT^ 
T HLENVTENFNMWKNNMVEQMHED IIS IjWDQSLKPCVKLTPLiCVTIjDCHNN I TNSNTTNNNAGE I KNCS FNMTTELiRDKKQKV 
YAIiFYRLD WQ INKNNS QYRLINCNTS AI TQACPKVS FEP I P IHYCAPAGFAI LKCNDKEFNGTGPCKNVSTVQCTHGIKPV 
VSTQIiLLNGSIiAEEE IVIRSENITNNAKTI I VQLVKPVKINCTRPNNNTRKS VR I GPGQTFYATGD I IGDIRQAHCNVSRTK 
WNNTLQQVATQLRKYFNKTIIFANPSGGDIiEITTHSFNCGGEFFYCNTSELFNST^ 
VGQAMYAPPIQGVIRCESNITGLLLTRIXXSNNNSTN^^ 

AVGLGAVFLiGFLiGAAGS TMGAAS I TLiTVQARQLIjSG I VQQQ SNLLRAI EAQQHLLiKLTVWG I KQLQARVLAIiER YtiKDQQLLi 
GIWGCSGKLICTTTVPWNSSWSNKTYNDIWDNMTWIjQWDKEISOT 

TNWLWYIKIFIMIVGGLIGIjRIVFAVIiTIINRVRQGYSPLSFQTLTHHQREPDRPERIEEGGGEQDRDRSVRIjVSGF 

DDLRSLCLFS YHRIoRDFVLI AARTVELLGHS SLKGI^^ 

GQRAGRAILNIPRRIRQGLERAliIi$ 
2003 CON_02_AG Env.seq.opt

ATGCGCGTGATGGGCATCCAGAAGAACTACCCCCTGCTGTGGCGCTGGGGC^ 

ACGCCGAGAACCTGTGGGTGACCGTGTACTACGGCK3TGCCCGTGTGGCGCGACGCCGAGACCACCCTGTTCTGCGCCTCCGA 

CGCCAAGGCCTACGACACCGAGGTGCACAACGTGTGGGGCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAGATC 

GACCTXK3AGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCAC 

GGGACCAGTCCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTGACCCTGGACTGCCACAACAACATC^ 

CACCACCAACAACAACGCCGGCGAGATCAAGAACTGCTCCTTCAACATGACCACCGAGCTC 

TACGCCCTGTTCTACCGCCTGGACGTGGTGCAGATCAACAAGAACAACTCCCAGTACCGCCTGATCAACTGCAAC
CCATCACCCAGGCCTGCCCCAAGGTGTCCTTCGAGCCGAT 

GTGCAACGACAAGGAGTTCAACGGCACCGGCCCCTGCAAGAACGTGTCCACCGTGCAGTGCACCC^ 

GTGTCCACCCAGCTGCTGCTGAACGGCTCCCTGGCCGAGGAGGAGATCGTGATCCGCTCCGAGAACATCACCAACAACGCCA
AGACCATCATCGTGCAGCTGGTGAAGCCCGTGAAGATCAACTGCACCCGCC 

CGGCCCCX3GCCAGACCTTCTACGCCACCGGCGACATCATCGGCGACATCCGCCAGGCCCACTGCAACGTGTCCCGCACCAAG 
TGGAACAACACCCTGCAGCAGGTGGCCACCCAGCTGCGCAAGTACTTCAACAAGACCATCATC 
GCGACCTGGAGATCACCACCCACTCCTTCAACTGCGGCGGCGAGTTC 
CTGGAACTCCACCTGGAACAACACCGAGAAGTGCATGACCC^ 

GTGGGCCAGGCCATGTACGCCCCCCCCATCCAGGGCGTGATCCGCTGCGAGTCCAACATCACCGGCCTGCTGCTGACCCGCG 

ACGGCGGCAACAACAACTCCACCAACGAGACCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGKaCGCTCCGAGC 

CAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCC 

GCCGTGGGCCTGGGCGCCGTGTTCCT^K^CTTCCTGGGCGCCGCCGGCTCCACCATGGGCGCCGCCTCCATCACCCTGACCG 

TGCAGGCCCGCCAGCTGCTGTCCGGCATCGTGCAGCAGCAGTCCAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACCTGCT 

GAAGCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCCTGGAGCGOTACCT 

GGGATCTGGGGCTGCTCCGGCAAGCTGATCTGCACCACCACCGTGCCCT^ 

ACATCTGGGACAACATGACCTGGCTGCAGTGGGACAAGGAGATCT 

GTCCCAGAACCAGCAGGAGAAGAACGAGCAGGACCTGCTGGCCCTGGACAAGTGGGCCTCCCTGTGGAACTGGTTCG 

ACCAACTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGAT 

CCATCATCAACCGCGTGCGCCAGGGCTACTCCCCCC 

CGAGCGCATCGAGGAGGGCGGCGGCGAGCAGGACCGCGACCGCTCCGTGCGCCTGGTGTCCGGCTTCCTGGCCCTGGCCTGG 
GACGACCTGCGCTCCCTGTGCCTGTTCTCCTACCACCGCCTGCGCGACTTCGTGCTGATCGCCGCCCGCACCGTGGAGCTGC 
TGGGCCACTTCCTCCCTGAAGGGCCTGCGCCTGGGCTGGGAGGCCCT^A 

GGAGCTGAAGAACTCCGCCATCAACCTGCTGGACACCATCGCCATCGCCGTGGCCAACTGGACCGACCGCGTGATCGAGATC 
GGCCAGCGCGCCGGCCGCGCCATCCTGAACATCCCCCGCCGCATCCGCCAGGGCCTGGAGCGCGCCCTGCTGTAA 


2003 CON_03_AB Env

Or MRVKE I RKHLWRWGTLFLGMLM I CS ATENLWVTVYYGVT>WKEATTTLFCASDAKAYS KKVHNVWATYACVPTDPSPQE I PL 
1 ENVTENFNMGKNNMVEQMHEDI I SL WDQS LKPCVKXiTPljCVTLNCTDLKiQJVTSTNTS S I KMMEMKNCS FNI TTDLRDKVKK 
EYALFYKI^WQIDNDSYRLISCOTSVVTQACP 

STQLLLNGSLAEEEWIRSVNFTDNTKTI I VQLKE P VE I NCTRPNNNTRKG I H I GPGRAFY ATGD I IGDIRQAHCNISITKW 
NNTLKQ I VI KLRKQ FGNKT I VFNQS SGGDPE I VMHS FNCGGE FFYCNTTKLFNSTWNGTEELNNTEGD I VTIj PCR I KQI INM 
WQEVGKAMYAPP I AGQ I RCS SN I TGLLLTRDGGNQSNVTE I FRPGGGDMRDNWRS EL YKYKVVKI EPIjGVAPTKAKRRVVQR 
BKRAVG IGAVFLGF LGAAGS TMGAAS I TLTVQARQLL SG I VQQQNNLLRAI EAQQHLLQDTVWG I KQLQARVLAVER YLKDQ 
QLLGIWGCSGKLICTTAVPWNTSWSNKSIJ5EIWNNMTWMEWEREINNYTGLIYNLIEES 

FDISKWLWYIKIFIMIVGGLVGLRIIFAVLSIVNRVRQGYSPLSFQTRLPTQRGPDRPEGIEEEGGERDRDTSIRLW 

LIWDDLRSLCLFIYHHIjRDLLLIAARIVEIjLiGRRGWEALKYWWNLLQYWIQELKBSA^ 

CRAIRNIPRRIRQGAEKALQ$ 



2003 CON_03_AB Env.seq.opt 

ATGCGCGTGAAGGAGATCCGCAAGCACCTG1XK3CGCTGGGGCACCCTGTTCCTGGGCATGCTGATGATCTGCTCCGCCACCG 
AGAACCTGTGGGTGACCGTGTAC^ACGGCGTGCCCGTGTGGAAGGAGGCCACCACC^CCCTGTTCTGCGCCTCCGACGCCAA 
GGCCTACTCCAAGGAGGTGCACT^CGTGTGGGCCACCTACGCCTGCGTGCCCACCGACCCCTCCCCCCAGGAGATCCCCCTG 
GAGAACGTGACCGAGAACTTCAACATGGGCAAGAACAACATGGTGGAGCAGATGCACGAGGACATCATCTCCCTGTGGGACC 
AGTCCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTGACCCTGAACTGCACCGACCTGAAGAAGAACGTGACCTCCAC 
CAACACCTCCTCCATCAAGATGATGGAGATGAAGAACTGCTCCTTCAACATCACGACCGACCTGCGCGACAAGGTGAAGAAG
GAGTACGCCCTGTTCTACAAGCTGGACGTGGTGCAGATCGACAACGACTCCTACCGCCTGATCTCCTGCAACAC CTC CGTGG 
TGACCCAGGCCTGCCCCAAGATCTCCTTCGAGCCC^^ 

CAACGACAAGAAGTTCAACGGCACCGGCCCCTGCACCAACGTGTCCACCGTGCAGTGCACCCACC^ 
TCCACCCAGCTGCTGCTGAACGGCTCCCTGGCCGAGGAGGAGGTGGTGATCC 

CCATCATCGTGCAGCTGAAGGAGCCCGTGGAGATCAACTGCACCCGCCCCAACAACAACACCCGCAAGGGCATCCACATCGG 
CCCCGGCCGCGCCTTCTACGCC^CCGGCGACATCATCGGCGACATCC^

AACAACACCCTGAAGCAGATCGTGATCAAGCTGCGCAAGCAGTTCGGCAACAAG^ 

GCGACCCCGAGATCGTGATGCACTCCTTCAACTGCGGCGGCGAGTTCCT 

CTGGAACGGCACCGAGGAGCTGAACAACACCGAGGGCGACATCGTGACCCTGCCCTGCCGCATCAAGCAGATCA 
TGGCAGGAGGTGGGGAAGGCCATGTACGCCCCCCCCATCGCCGGCCAGATCCGCTGCTCCTCCAACATCACCGGCCTGCTGC 
TGACCCGCGACGGCGGCAACCAGTCCAACGTGACCGAGATCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCTC 
CGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACC^ 

" GAGAAGCGCGCCGTGGGCATCGGCGCCGTGTTCCTGGGCTTCCTGGGCGCCGCCGGCTCCACCATGGGCGCCGCCTCCATCA 
CCCTGACCGTGCAGGCCCGCCAGCTGCTK3TCCGGCATCGTGCAGCAGCAGA 
GCACCTGCTGCAGCTGACCX3TGTGGGGCATCAAGCAGCTG<^ 

CAGCTGCTGGGCATCTGGGGCTGCTCCGGCAAGCTGATCTGCACCACCGCCGTGCCCTGG
CCCTGGACGAGATCTGGAACAACATGACCTGGATGGAGTGGGAGCGCX5AGATCAACAACTACACCGGCCTGATCTACAACCT 
GATCGAGGAGTCCC7VGAACCAGCAGGAGAAGAACGAGCAGGAGATCCTGGCCCTGGACAAGTGGGCCTCCCTGTGGAACTGG 
TTCGACATCTCCAAGTGGCTGTGGTACATCAAGATCTTCAT<^TGATCGTGGG 

CCGTGCTGTCGATCGTGAACCGCGTGCGCCAGGGCTACTCCCCCCTGTCCTTCCAGACCCGCCTGCCCACCCAGCGCGGCCC 

CGACCGCCCCGAGGGCATCGAGGAGGAGGGCGGCGAGCGCGACCGCGA.CACCTCCATCCGCCTGGTGAACGGCTTCCTGGCC 

CTGATCTGGGACGACCTGCGCTCCCTGTGCCTGTTCATCTACCACCACCTGCGCGACCTGCTGCTGATCGCCGCCCGCATCG 

TGGAGCTGCTGGGCCGCCGCX5GCTGGGAGGCCCTGAAGTACTGGTGG 

CTCCGCCATCAACCTGATCGACACCATCGCCATCGCCGTGGCC 

TGCCGCGCCATCCGCAACATCCCCCGCCGCATCCGCCAGGGCGCCGAGAAGGCCCTGCAGTAA 
2003 CON_04_CPX Env

MRVMGIQRNYPHLWEWGTLILGLVIICSASKNL^ 
IAIiKNVTENFJSTMWKNNMVEQMHEDI I SLWDEGLK^ 

EYALFYRXjDXVPXNDSANNNS INSEYMLINCNASTIKQACPKVTFEPIPIHYGAPAGFAIIjKCNDKNFTGI 
THGIKPWSTQLLLNGSLATEGWIRSKNFTDNTK^ 
CNISGNDWNETIiQKIVEELRKHFPNlCrilFAPSAGGDLEIT^ 

IKQIVSMWQEVGQAMYAPPIAGSINCSSDITGIILTIUX3GNNNTNNETFRPGGGDMRDN^ 

RRRWQREKRAVG I GAVFLGFLGAAG S TMGAAS I TLTVQARQLLSG X VQQQSNLiLRA I EAQQHLItRLTVWG I KQLQARVXiAXj 
ESYIiKDQQLIjGXWGCSGKljXCTTNVPWNSSWSNKSYinDXWDNMTWLQWDKEXNNYTQXXYEL 
WANLWNWFNISNWIiWYIKIFIMIVGGLIGIjRIIFAVIiSIVNR^ 
RLVNGFLPLIWDDLRNLCLFSYRHLRNLLL 
1 1 EAVQRACRA I RNI PRR I RQGLERALL $ 



2003 CON_04_CPX Env.seq.opt

ATGCGCGTGATGGGCATCCAGCGCAACTACCCCCACCTGTGGGAGTGGGGCACCCTGATCCTGGGCCTGGTGATCATCTGCT 

CCGCCTCCAAGAACCTGTGGGTGACCGTGTACTACGGCGTGCCCGTGTGGCGCGACGCCGAGACCACCCCCTTCTGCGCCTC 

CGACGCC^UVGGCCTACGACAAGGAGGTGCACAACATCTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAG 

ATCGCCCTGAAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGGACGAGGACATG 

TGTGGGACGAGGGCCreAAGCCCTGCGTGAAGCTGACCCCCCTGTC 

CTCGACGAAGACGAACTCCACCGAGGAGATCAAGAACTGCTCCTTCAACZATCACCACCGAGATCCGCGACA 

GAGTACGCCCTGTTCTACCGCCTGGACATCGTGCCCATCAACGACTCCGCCAACAACAACTCCATCAACTCCGAGTACATGC 

TGATGAACTGCAACGCCTCCACCATCAAGC^ 

CGCCGGCTTCGCCATCCTrGAAGTGCAACGACAAGAACTTCACCGGCCTGGGCCCCTGCACCAACGTGTCCTCCGTGCAGTGC 
ACCCACGGCATCAAGCCCGTGGTGTCCACCCAGC?TGCTGCTGAACGGCTCCCTGGCCACCGAGGGCGTGGTGATCCGCTCCA 
AGAACTTCACCGACAACACCAAGAACATCATCGTGCAGCTGGCCAAGGCCGT 

CACCCGCAAGTCCGTGCACATCGGCCCCGGCCAGACCTGGTACGCCACCGGCGAGATCATCGGCGACATCCGCCAGGCCCAC 
TGCAACATCTCCGGCAACGACTGGAACGAGACCCTGCAGAAGATCGTGGAGGAGCTGCGCAAGC 
TCATCTTCGCCCCCTCCGCCGGCGGCGACCTGGAGATCACC^^ 
CACCTCCX3AGCTGTTCAACTCCACCTACATGAACTCCACCAACTCCAC 

ATCAAGCAGATCGTGTCCATGTGGCAGGAGGTGGGCCAGGCCATGTACGCCCCCCCCATCGCCGGCTCCATCAACTGCTCCT 
CCGACATC7VCCGGCATCATCCTGACCCGCGACGGCGGGAACAACAACACCAACAACG 

CATGCGCGACAACTGGCGCTCCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCGTGGGCGTGGCCCCCACCCGCGCC 
CGCCGCCGCGTGGTGC^GCGCGAGAAGCGCGCCGTGGGCATCGGCGCCGTGTTCCTGGGCTT 

CCATGGGCGCCGCCTCCATCACCCTGACCGTGCAGGCCCGCCAGCTGCTGTCCGGCATCGTGCAGCAGCAGTCCA 

GCGCGCCATCGAGGCCCAGCAGCACCTGCTGCGCCTGACCGTGTGGGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCCTG 

GAGTCCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCTCCGGC 

ACTCCTCCTGGTCCAACAAGTCCTACAACGACATCTGGGACAACATGACCTGGCTGCAGTGGGACAAGGAGATCAACAAC 
CACCCAGATCATCTACGAGCTGCTGGAGGAGTCCGAGAACCAGCAGGAGAAGAACGAGCAGGACCTGCTGGCCCTGGACAAG

TGGGCCAACCTGTGGAACTGGTTCAACATCTCCAACTGGCTGTGGTACATCAAGATCTTCATC^ 

TCGGCCTGCGCATCATCTTCGCCGTGCTOTCC^TC 

GATCCCCACCACCCAGCGCGGCCCCGACCGCCCCGAGGGCACCGAGGAGGAGGGCGGCGAGCAGGACCGCTCCCGCTCCATC 
CGCCTGGTGAACGGCTTCCTGCCCCTGATCTGGGACGACCTGCGCAACCTGTGCCTGTTCTCCTACCGCCACCTGCGCAACC 
TGCTGCTGATCGTGGCCCGCACCGTGGAGCTGCTGGGC^^ 
GTACTGGGGCCAGGAGCTGCGCAACTCCGCCATCAACCTGCTC 

ATCATCGAGGCCGTGCAGCGCGCCTGCCGCGCCATCCX3CAACATCCCCCGCCGCATCCGCCAGGGCCTGGAGCGCGCCCTGC 
TGTAA 


2003 CON_06_CPX Env

MRVKGIQK^QHLWKWGTLIIXSLVIICSASNimWVTVYYG 

I AIjBNVTENFNMWKNHMVEQMHED IIS LWDESIiKPCVKLTPLCVTLNCTNVTKNNNTKIMGREE I KNCS FNVTTE I RDKKKK 

EYALFYRLDVVPIDDNNNSYRLINCNASTIKQACPKVSFEPIPIHYCAPAGFA 

VVSTQLLLNGSLAEEEII IKSENLTDNTK^ 

DWNNMLQNVTAKLKEIiFNKNITFNSSAGGDIiEITTHSFNCGGEFFYC^ 
AMYAPPIAGNITCTSNITGLLIiTRDGNNNDSETFRPGGGDMRD^^ 
AVFIX3FIX3TAGSTMGAASITLTVQWQLLSGIVQQQSNLLRAIEAQQHLLQL 
SGKLICPTNVPWNASWSNKTYNEIWDNMTWIEWDREINNYTQQIYSL^ 

YIKIFIMIVGGIjIGIiRIVTAVIiSIVNRVRQGYSPLSLQTLIPNPTGADRPGEIEEGGGEQGRTRSIRL 

LCIiFSYHRLRDFVLIAARTVETIiGHRGWEIIjKY^GNLVCrYW 

RRIRQGFERALLS 

2003 CON_06_CPX Env.seq.opt

ATGCGCGTGAAGGGCATCCAGAAGAACTGGCAGCACCTGTGGAAGTGGGGCACCCTGATCCTGGGCCTGGTGAT 

CCGCCTCCAACAACATGTGGGTGACCGTGTACTACGGCGTGCCCGCOTGGGAGGACGCCG 

CGACGCCAAGGCCTACTCCGCCGAGAAGC^C^^CGTGTGGGCC^ 

ATCGCCCTGGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACCACATGGTGGAGCAGATGCACGA 

TGTGGGACGAGTCCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTC 

CAAGACCAAGATCATGGGCCGCGAGGAGATCAAGAACTGCTCCTT^^ 

GAGTACGCCCTGTTCTACCGCCTGGACGTGGTGCCCATCGACGACAACAACAACTCCTACCGCCTGATCAACTGCA^ 

CC^CCATC^^GCAGGCCTGCCCCAAGGTGTCCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCr^ 

GAAGTGCCGCGACAAGAACTTCAACGGCACCGGCCCCTGCAAGAACGTGTCCACCGTGCAGTGCACCCA 

GTGGTGTCCACCCAGCTGCTGCTGAACGGCTCCCTGGCCGAGGAGGAGATCATCATCAAGTCCGAGAACCTGACCGACAACA 
CCAAGACCATCATCGTGCAGCTGAACAAGTCCGTGGAGATCCGCTOCACCCGCCCCAACAACAACACCCX3CA 
CITCGGCCCCGGCGAGGCCTTCTACGCCACCGGCGAGATCATCGGCGACATCCGCCAGGCCCACTGCAACG 
GACTGGAACAACLATGCTGCAGAACGTGACCGCCAAGCTGAAGGAGCTGTTCAACAAGAACA 

GCGGCGACCTGGAGATCACCACCCACTCCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACACCTCCCAGCTGTTCAACTC 

CACCCGCCCCAACGAGACCAACACCATCACCCTGCCCTGCAAGATCAAGCAG 

GCCATGTACGCCCCCCCCATCGCCGGG^CATCACC^ 

ACAACGACTCCGAGACCTTCCGCCCCGGCGGCGGCGACATGCGCGACAACTGGCGCTCCGAGCTGTACAAGTACAAGGTGGT 
GAAGATCAAGCCCCTGGGCATCGCCCCCACCCGCGCCCGCCGCCGCGTGGTGGGCCGCGAGAAGCGCGCCGTGGGCCTGGGC 
GCCGTGTTCCTGGGCTTCCTGGGCACCGCCGGCTCCACCATGGGCGCCGCCTCCATCACCCTGACCGTGCAGGTGCGCCAGC 
TGCTGTCCGGGATCGTGC^GCAGCAGTCCAACCTC 

GGGCATCAAGCAGCTGCAGGCCCGCGTGCTGGCCGTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGC 
TCCGGCAAGCTGATCTGCCCC^CCAACGTGCCCTGGAACGCCTCCTGGT^ 

TGACCTGGATCGAGTGGGACCGCGAGATCAACAACTACACCCAGCAGATCTACTCCCTGATCGAGGAGTCCCAGAACCAGCA 
GGAGAAGAACGAGCAGGACCTGCTGGCCCTGGACAAGTGGGCCTCCCTGTGGTCCTGGTTCGACATCTCCAACTGGCTGTGG 
TACATCAAGATCTTCATCATGATCGTGGGCGGCCTGATCGGCCTGCGCATCGTGTTCGCCGTGCTGTCCATCGTGAACCGCG 
TGOGCCAGGGCTACTCCCCCCTGTCCCTGCAGACCCTGATCCC^ 

GGGCGGCGGCGAGCAGGGCCGCACCCGCTCCATCCGCCTGGTGAACGGCTTCCTGGCCCTGGCCTGGGACGACCTGCGCTCC 
CTGTGCCTGTTC?TCCTACCACCGCCTGCGCGACTTCGTGCTGATCGCCGCCCX5CACCGTGGAGACCCTGGGCCACCGCGGCT 
GGGAGATCCTGAAGTACCTGGGCAACCTGGTGTGCTACTGGGGCCZAGGAGCTGAAGAACTCCGCCATCTCCCTGCTGGACAC 
CACCGCCATCGCCGTGGCCAACTGGACCGACCGCGTGATCGAGGTGGTGCAGCGCGTGTTCCGCGCCTTCCTGAACATCCCC 
CGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAA

2003 CON_08_BC Env
MRVRGTRRNYQQVWIWGVLGFWMLMICNVEGNIjWVTV^
IVME^rVTENFN^IVmNDMVNQ^^ 

DRKICIVYAIiFYiaiDIVPLNDENSGKNSSEYYRIilNCNTSAITO 

STVQCTHGIKPWSTQLIiliNGSIiAEREII IRSENI/TNNVKTII VHLNQSVEIVCTRPNNNTRKS IRIGPGQTFYATGDI IGD 
IRQAHCNISKDKWYETLQRVSKKLAEHFPNKTIKFASSSGGDLEITTHSra 

PCR I KQI INMWQEVGRAMYAPP I EGN ITCKSN I TGLLLVRDGGRTESNNTE I FRPGGGDMRNNWRNELYKYKWEIKPLGVA 

PTAAKRRVV^REKRAVGLGAVFLGFLGAAGSTMGAAS ITLTVQARQLLSGI VQQQSNLLRAI EAQQHMLQLTVWG I KQLQTR 
" VIAIERYIjKDQQLIjGIWGCSGKIjICTTAVPWNSSWSNKSQQEIWDNMTWMQWDKEISNYTOT 

ALDSWKNLWSWFDITNWIiWYIKIFIMIVGGLIGIiRIIFAVLSIVNRVRQGYSPLSFQILTPNPGGPGRI^ 

rSIRLVNGFIiALAWDDIjRNLCIjFSYHRIjRDFIIjIjTARGV^ 
• AIAVAEGTDRIINIVQGICRAIHNIPRRIRQGFEAALQ$ 

2003 CON_08_BC Env.seq.opt

ATGCGCGTGCGCGGCACCCGCCGCAAC^ACCAGCAGTGGTGGATCTGGGGCGTGCTGGGCTTCTGGATGCTGATGATCTGC^ 
ACGTGGAGGGCAACCTGTGGGTGACCGTGTACTACGGCGTGCCCGTGTGGAAGGAGGCCAAGACCACCCTGTTCTGCGCCTC 
CGACGCCAAGGCCTACGAGACCGAGGTGCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAG 
ATCGTGATGGAGAACGTGACCGAGAACTTCAACATGTGGAACAACX5ACATGGTGAACCAGATG 

TGTGGGACCAGTCCCTG7^GCCCTGCGTGAAGCTGACCCCCCTGTGCGTGACCCTGGAGTGCACCAACGTGTCCTCCAACGG 
CAACGGCACCTACAACGAGACCTACAACGAGTCCGTGAAGGAGATCAAGAACTOCTCCTTCAACG 

GACCGCAAGAAGACCGTGTACGCCCT^TTCTACCGCCTGGACATCGTGCCCCTGAACGACGAGAACTCCGGCAAGAACTCCT 
CCGAGTACTACCGCCTGATC^^CTGCAACACCrrCCGCC^T^ 

CCACTACTGCACCCCCGCCGGCTACGCCATCCTGAAGTGCAACGACAAGAAGTTCAACGGCACCGGCCAGTGCCACAACGTG 

TCCACCGTGCAGTGCACCCACGGCATCAAGCCCGTGGTGTCCACCCAGCTGCTGCTGAACGGCTCCCTGGCCGAGCGCGAGA 

TCATCATCCGCTCCGAGAACCTGACCAAGAACGTGAAGACCATCATCGTGCACCTGAACCAGTCCGTGGAGATCGT 

CCGCCCC^^CAACAACACCCGC^U^GTCCATCCGCM 

ATCCGCCAGGCCGACTGCAACATCTCCAAGGAGAAGTGGTACGAGAC 

TCCCCAACAAGACCATCAAGTTCGCCTCCTC^ 

GTTCTTCTACTGCAACACCTCCGGCCTGTTCAACGGCACCTACATGAACGGCACGAACAACTCCTCCTCCA 
CCCTGCCGCATCAAGCAGATCATCAACATGTGGGAGGAGGTGGGCCGCGCCAT 

CCTGCAAGTCCAACATCACCGGCCTGCTGCTGGTGCGCGACGGCGGCCGCACCGAGTCCAACAACACCGAGATCTTCCGCCC 

CGGCGGCGGCGACATGCGCAACAACTGGCGCAACGAGCTGTACAAGTACAAGGTGGTGGAGATCAAGCCCCTGGG 

CCCACC^CCGCCAAGCGCCGCGTGGTGGAGCGCGAGAAGCGCGCCGTGGGCCTGGGCGCCGTGTTCCTGGGCTTCCTGGGCG 

CCGCCGGCTCCACCATGGGCGCCGCCTCCATCACCCTGACCGTGCAGGCCCGCCAGCra 

GTCCAACCTGCTGCGCGCCATCGAGGCCCAGCAGCACATGCTGCAGCTC 

GTGCTGGCCATCGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCTCCGGCAAGCTGATCTGCACCACCG 
CCGTGCCCTGGAACTCCTCCTGGTCCAACAAGTCCCAGC^GGAGAT 

GATCTCCAACTACACCAACACCATCTACCGCCTGCTGGAGGACTCCCAGAACCAGCAGGAGCGCAACGAGAAGGACCTGCTG 
GCCCTGGACTCCTGGAAGAACCTOTGGTCCTGGTTCGACATCACCAACTGGCT 

TGGGCGGCCTGATCGGCCTGCGCATCATCTTCGCCGTGCTGTCCATCGTGAACCGCGTGCGCCAGGGCTACTCCCCCCTGTC 
CTTCGAGATCCTGACCCCCAACCCCGGCGGCCCCGGCCGCCTGGGCCGCATCGAGGAGGAGGGCGGCGAGCAGGACAAGACC 
CGCTCC^TCCGCCTGGTGAACGGCTTCCTGGCCCTGGCCTGGGACGACCrTO 

TGCGCGACTTCATCCTGCTGACCGCCCGCGGCGTGGAGCTGCTGGGCCGCAACTCCCTGCGCGGCCTGCAGCGCGGCTGGGA 
GGCCCTGAAGTACCTGGGCTCCCTGGTGCAGTACTGGGGCCTGGAGCTGAAGAAGTCCACCATCTCCCTC 
GCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCAACATCGTGCAGGGCATCTGCCGCGCCATCCACAACATCCCCCGCC 
GCATCCGCCAGGGCTTCGAGGCCGCCCTGCAGTAA 

2003 CON_10_CD Env 
MRVMGIQRNCQQWWIWGILGFWMLMICNATGNLWVTVYYGVPVWKETTTTLFCASDAKAYKA
IVLENVTENFNMWKNGMVDQMHEDIISLWDTO

EYAIiFYKL^VVQIDGSNTSYRIjINCNTSAITQACPKVTFEPIPIHYCAPAGFAILK^ 

VVSTQLLLNGSIiAEEEIIIRSENIjTDNAKTI IVQLNES IGNIRQAYCNISGT 

EWNKTIiQQVAKKLGDIjLNKTTI I FKPSSGGDPEITTHTFNCGGEFFYOrrSKLFNSSWTSiraTGNTSTITLPCRIKQIINMW 

QGVGKAIYAPPIAGLINCSSNITGLDIiTRDGGAOTJSETFRPGGGDMRDNTOSELYKYKVVKIEP 

AIGI/3AVFLGFI^AAGSTMGAASiyri/rVQARQL^ 

GIWGCSGKHICTTNVPWNSSWSNKSLEEIWDNMTWMEVre^ 

TNWLWYIKIFIMIVGGIilGLRIVFAVLSLVNRVRQGYSPIiSFQTLLPAPRGPDRPEGIEEEGGEQGRGRSIRIiWG 
DDLRNLCLFSYHRLRDLILIATRIVELLGRRGWEAIKYLWNLLQYWIQELKNSAISLLDTTAIAVAEGTDRAIEIVQRAVRA
VLNIPTRIRQGLERALL$
2003 CON_10_CD Env.seq.opt

ATGCGGGTGATGGGCATCGAGCGCAACTGCCAGCAGTGGTGGATCTGGGGCATCCTGGGCTTCTGGATGCTGATGATCTGCA
ACGCCACCGGGJ^CCTGTGGGTGACCGTGTACTACGGCGTGCCCGTGTGGAAGGAGACCACCACCA 
CGACGCCAAGGCCTACAAGGCCGAGGCCC^Cy^CATCTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCC 
ATCGTGCTGGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACGGCATGGTGGACCAGATC 
TGTGGGACCAGGGCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTGACCC 

CTCCGCCACCAACACCGTGGTGGCCGGCATGAAGAACTGCTCCTTCAACATCACCACCGAGATCCGCGACAAG 

GAGTACGCCCTGTTCTACAAGCrnMACGTGGTGCAGATCGACXKSCTCCAACACCTCCTA
CCGCCATCACCCAGGCCTGCCCC^^GGTGACCTTCGAGCCCATC 
GAAGTG(^VACGACAAGAAGTTCAACGGCACCGGCCCCTGCAAGAACGTGT 

GTGGTGTCCACCCAGCTGCTGCTGAACGGCTCCCTGGCCGAGGAGGAGATCATCATCCGCTCCGAGAACCTGACCGACAACG
CCAAGACCATGATCGTGCAGCTGAACGAGTCCGTGACCATCAACTGCACCCGCCCCAACAACAACACCCGCAAGTCCATCCG
CATCGGCCCCGGCCAGACCTTCTACGCCACCGGCGACATCATCGGCAACATCCG^ 
GAGTGGAACAAGACCCTGCAGCAGGTGGCCAAGAAGCTGGGCGACCTGCTGAAGAAGA^ 
CCGGCGGCGACCCCGAGATCACGACCCACACCTTCAACTGCGGCGGCGAGTTCTTCT^ 
CTCCTCCTGGACCTCCAACAACACCGGCAACACCTCCACCATCA.CCCTGCCCTG 
CAGGGCGTGGGCSU^GGCC^TCTACGCCCCCCCC^TCGCCGGCCTGAT 
CCCGCGACGGCGGCGCCAAC^^CTCCGAGACCTTCCGCCCCGGCGGCGGC^ 
CAAGTACAAGGT^GTGAAGATCGAGCCCCTGGGCCTGGCCCCC^CCAAGGCGAAG^ 
GCGATCGGCCTGGGCGCCGTGTTCCrroGGCTTCCTGGGCGCCGCCGGCTCCACCATGGGCGCCGCCTC 
TGCAGGCCCGCCAGCTGCTGTCCGGCATCGTGCAGCA.GCA.GAACAACCTGCTGCG 
GCAGCTGACCGTGTGGGGCATC^^GCAGCTGCAGGCCCGCGTC 
GGCATCTGGGGCTGCTCCGGCAAGCACATCTGCACCACCAACGTGCC 
AGATCTGGGACAACATGACCTGGATGGAGTGGGAGCGCGAGATCGAC^ 

GTCCCAGAACCAGCAGGAGAAGAACGAGCAGGAGCTGCTGCAGCTGGACAAGTGGGCCTCCCTGTGGAACTGGTTCTCCATC

ACCAA.CTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGATCGGCCTGCGCATCGTGTTCGCCGTGCT 

CCCTGGTGAACCGCGTGCGCCAGGGCTACTCCCCCCTGTCCCT 

CGAGGGCATCGAGGAGGAGGGCGGCGAGCAGGGCCGCGGCCGCTCCATCCGCCTGGTGAACGGCTTCTCCGCCCTGATCTGG 
GACGACCTGCGCAACCTGTGCCTGTTCTCCTACCACCGCCTGCGCGACCTGATCCTGATCGCCACCCGCATCGTGGAGCTGC 
TGGGCCX3CCGCGGCTGGGAGGCCATCAAGTACCTGTGGAACCTGCTGCAGTACTGGATCCAGGAGC 

CTCCCTGCTGGACACCACCGCCATCGCCGTGGCCGAGGGCACCGACCGCGCCATCGAGATCGTGCAGCGCGCCGTGCGCGCC 
GTGCTGAACATCCCCACCCGCATCCGCCAGGGCCTGGAGCGCGCCCTGCTGTAA 



2003 CON_11_CPX Env

MRVKETQRNWHNLWRWGLMIFGMLMICNATENLWVTVYYGVPV^

IPLENVTENFNMWKNNMVEQMHEDIISLWDESLKPCVKLTPLCVTLNCTDVKNATNTTVEAAEIKNCSFNITTEIKDKKKKE
YALFYKLDVVPINDNNNSIYRLINCNVSTVKQACPKVTFEPIPIHY

VVSTQLLLNGSLAEGEVRIRSENFTWNAKTIIVQLNSSVRINCTRPNNNTRKSIHIGPGQAFYAT^

EWNNTLQQVAKQLRENFMKTIIFNNPSGGDLEITTHSFNCGGEFFYCNT 

WQRVGQAMYAPPIQGKIRCNSNITGLIjLTRDGGNNN 

KRAVGIGAVLLGFLGAAGSTMGAASITLTVQARQLLSGIVQQQSNLLKAIEAQQHL

LLGIWGCSGKLICTTNVPWNFSWSNKSYDEIWDNMTWIEW^^

DISNWLWYIKIFIMIVGGLIGLRIIFAVLSIVNRCRQGYSPL^ 

AWDDIJWLCLFSYHRLRDFIIjIAARIVETIjGRRGW^ 

RAILHIPRRIRQGFERALL$

2003 CON_11_CPX Env.seq.opt

ATGCXSCGTGAAGGAGACCCAGCGCAACTGGCAC^^CCrrGTGGCGCTGGGGCCTGATG 

ACGCCACCGAGAACCTGTGGGTGACCGTGTACTACGGCGTGCCCGTGTGGAAGGACGCCGACACCACCCTGTTCTGCGCCTC
CGACGCCAAGGCCTACTCCACCGAGAAGCACAACGTGTGGGCCACCCAC^ 

ATCCCCCTGGAGAACGTGACCGAGAACTTCAACATGTGGAAGAACAACATGGTGGAGCAGATGCACGAGGACATCATCTCCC
TGTGGGACGAGTCCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTGACCCTGAACTGCACCGACGTGAAGAACGCCAC 
CAACACCACCGTGGAGGCCGCCGAGATCAAGAACTGCTCCTTCAACATCACCACCGAGATCAAGGACAAGAAGAAG 
TACGCCCTGTTCTACAAGCTGGACGTGGT^CCCATCAACGACAACAACAACTCCATCTACCGCCT^ 

CCACCGTGAAGCAGGCCTGCCCCAAGGTGACCTTCGAGCCCATCCCCATCCACTACTGCGCCCCCGCCGGCTTCGCCATCCT 

GAAGTGCAACGACAAGAAGTTCAACGGCACCGGCCCCTGCAAGAACGTGTCCACCGTGCAGTGCACCCA.CGGCA 

GTGGTGTCCACCCAGCTGCTGCTGAACGGCTCCCTGGCCGAGGGCGAGGTGCGCATCCGCTCCGAGAACT^ 

CCAAGACC7VTC^TCGTGCAGCTGAACTCCTCCGTGCGCA.TCAACTGCACCCGCCCCAA 

CATCGGCCCCGGCCAGGCCT^CTACGCCACCGGCXSACATCATCGGCGACATCCGCCAGGCCCACT 







GAGTGGAACAACACCCTGCAGCAGK3TCK3CCAA 

GCGGCGACCTGGAGATCACCACCCACTCCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACACCTCCCGCCTGTTCAACTC 

CACCTGGAACAACGAGACCCGCAACGACACCAAGCAGATGGAC 

TGGCAGCGCGTGGGCCAGGCCATGTAC^CCCCCCCCATCCAGGGCAAGATC^ 

TGACCCGCGACGGCGGCAACAACAACACCAACGAGACCTTCCGCCCCACCGGCGGCGACATGCGCGACAACTGG 

GCTGTACAAGTACAAGGTGGTGGAGATCAAGCCCCTGGGCGTGGCCCCCACCCGCGCCAAGCGCCGCGTGGTGGAGCGCGAG

AAGCGCGCCGTGGGCATCGGCGCCGTGCTGCTGGGCTTCCTGGGCGCCGCCGGCTCCACCATGGGCGCCGCCTCCATCACCC 

TGACCGTGCAGGCCCGCCAGCTGCTGTCCGGCATCGTGCAGCAGCAGTCCAACCTGCTGAAGGCCATCGA 

CCTGCTGAAGCTGACCGTGTGGGGCATGAAGCAG

CTGCTGGGCATCTGGGGCTGCTCCGGCAAGCTGATCTGCACGACCAA

ACGACGAGATCTGGGAGAACATGACCTGGATCGAGTGGGAGCGCGAGATCAAGAA 

GGAGGAGTCCCAGAACCAGCAGGAGAAGAACGAGCAGGACCTGCTGGCCCTGGACAAGTG 

GACATCTCCAACTGGCTGTGGTACATCAAGATCTTCATCATGATCGTGGGCGGCCTGATCGGCCTGCGCATCATCTTCGCCG 
TGCTGTCCATCGTGAACCGCTGCCGCCAGGGCTACTCCCCCCT 

CCGCCCCGGCGGCATCGAGGAGGGCGGCGGCGAGCAGGACCGCACCCGCTCCATCCGCCTGGTGTCCGGCTTCCTGGCCCTG 

AGACCCTGGGCCGCCGCGGCTGGGAGATCCTGAAGTACCTGGGCAACCTGGCCCAGTACTGGGGCCAGGAGCTGAAGAACTC 
CGCCATCTCCCTGCTGAACGCCACCGCCATCGCCGTGGCCGAGGGCACCGACCGCATCATCGAGGTGGTGCACCGCGTGCTG 
CGCGCCATCCTGCACATCCCCCGCCGCATCCGCCAGGGCTTCGAGCGCGCCCTGCTGTAA 


2003 CON_12_BF Env 

MRVRGMQRNWQHIjGKWGIjIjFLjGIIjI I CNATENIjWVTVYYGVPVWKEATTTLFCASDAKS YERKVHNVWATHAC^ 
VDDENVTENFDMWKNNMVEQMHTDII SLWDQSLKPCVKLTPLCVTIjNCTDANATANATK^ 

KQMKVQALFYRLDIVPISDNNSNEYRLINCNTSTITQACPKVSWDPIPIHYCAPAGYAILKCNDKKFNGTGPCKNVSTV
HGIKPWSTQLIiliNGSIiAEEEIIIRSQNISDNAKTIIVHIjNESVQINCTRPNl^ 
NTVSGTQWNICITiEQVKKKIjRSYFNTTIKFNSSSGGDPEITMHSFNCRGEFFYC^ 
VGRAMYAAPIAGNITCTSNITGLLLTRDGGHNETNKTETF 

RAVGIGALFLGFLGAAGSTMGAASITLTVQARQLLSGIVQQQSNLLRAIEAQQHLLQLTVWGIKQLQARVLAVERYLKDQQL

LGLWGCSGKLICTTNVPWNSSWSimSQEEIWENMTWMEWEKE^ 

ISNWI,WYIRIFIMIVGGLIGLRIVFAVIiSIVNRVRKGY 

WDDLRSLCLFSYHRLRDLLLIVTRIVELLGRRGWEVLKYWWNLLQYWSQELKNSAISLLNTTAIVVAEGTDRVIEALQRVGR
AILNIPRRIRQGLERALL$

2003 CON_12_BF Env.seq.opt 

ATGCGCGTGCGCGGCATGCAGCGCAACTGGCAGCACCTGGGCAAGTGGGGCCTGCTGTTCCTC 
ACGCCACCGAGAACCTGTGGGTGACCGTGTACTACGGCGTGCCCGTGTGGAAGGAGGCCACCACCA 

CGACGCCAAGTCCTACGAGCGCGAGGTGCACAACGTGTGGGCCACCCACGCCTGCGTGCCCACCGACCCCAACCCCCAGGAG 

GTGGACCTGGAGAACGTGACCGAGAACTTCGACATGTGGAAGAACAACATGGTGGAGCAGATGCACACCGACATCATCTCCC 

TGTGGGACCAGTCCCTGAAGCCCTGCGTGAAGCTGACCCCCCTGTGCGTGACCCTGAACTGCACCGACGCCAACGCCACCGC 

CAACGCCACCAAGGAGCACCCCGAGGGCCGCGCCGGCGCCATCCAGAACTGCTCCTTCAACATGACCACCGAGGTGCGCGAC 

AAGCAGATGAAGGTGCAGGCCCTGTTCTACCGCCTGGACATCGTGCCCATCTCCGACAACAACTCCAACGAG 

TCAACTGCAACACCTCCACCATCACCCAGGCCTGCCCCAAGGTGTCCTGGGACCCCATCCCCATCCACTACTGCGCCCCCGC 

CGGCTACGCCATCCTGAAGTGCAACGACAAGAAGTTCAACGGCACCGGCCCCTGCAAGAACGTGTCCACCGTGCAGTGCACC 

CACGGCATCAAGCCCGTGGTGTCCACCCAGCTGCTGCTGAACGGCTCCCTGGCCGAGGAGGAGATCATCATCCGCTCCCAGA 

ACATCTCCGACAACGCCAAGACCATCATCGTGCACCTGAACGAGTCCGTGCAGATCAACTGCACCCGCCCCAACAACAACAC 

CCGC^UVGTCCATCCACATC^GCCCCGGCCGCGCCra 

AACGTGTCCGGCACCCAGTGGAACAAGACCCTGGAGCAGGTGAAGAAGAAGCTGCGCTCCTACTTCAACACCACCAT 

TCAACTCCTCCTCCGGCGGCGACCCCGAGATCACCATGCACTCCTTCAACTG 

GAAGCTGTTGAACGACACCGTGTCCAACGACACCATCATCCTGCCC 

GTGGGCCGCGCC^TGTACGCCGCCCCCATCGCCGGCAACATCACCTGCACCTCCAACATCACCGGCCTGCTGCTGACCCGCG 

ACGGCGGCCACAACGAGACCAACAAGACCGAGACCTTCCGCCCCGGCGGCGGCAACATGAAGGACAACTGGCGCTCCGAGCT 

GTACAAGTACAAGGTGGTGGAGATCGAGCCCCTGGGCGTGGCCCCCACCCGCGCCAAGCGCCAGGTGGTGAAGCGCGAGAAG 

CGCGCCGTGGGCATCGGCGCCCTGTTCCTGGGCTTCCTGGGCGCCGCCGGCTCCACCATGGGCGCCGCCTCCATCACCCTGA 

CCGTGCAGGCCCGCCAGCTGCTGTCCGGCATCGTGCAGCAGCAGTCCAACCTGCTGCGCGCCATCGAGGCCCAG 

GCTGCAGCTGACCGTGTGGGGCATCTVAGCAGCTGCAGGCCC^ 

CTGGGCCTGTGGGGCTGCTCCGGCAAGCTGATCTGCACCACCAACGTGCCCTGGAACT 

AGGAGATCTGGGAGAACATGACCTGGATGGAGTGGGAGAAGGAGATCAACAACTACTCCAACGAGATCTACCGCCTGATCGA 
GG AGTC C CAG AAC C AG CAGG AGAAG AACG AG CAGG AG C TG CTGGC C CTGGACAAG TGGGC CT C C C TGTGGAAC TGGTTC G AC 
ATCTCCAACTGGCTGTGGTACATCCGCATCTTCATCATGATCGTGGGCGGCCTGATCGG
TGTCCATCGTGAACCGCGTGCGCy^AGGGCTACTC

CCCCGAGGGCATCGAGGAGGGCGGCGGCGAGCAGGGCAAGGACCGCTCCGTGCGCCTGGTGAACGGCTTCCTGGCCCTGATC 
TGGGACGACCTGCGCTCCCTGTGCCTGTTCTCCTACCACCGCCTGCGCGACCTGCTGCTGATCGTGACCCGCATCGTGGAGC 
TGCTGGGCCGCCGCGGCTGGGAGGTGCTGAAGTACTGGTGGAACCTGCTGCAGTACTGGTCCCAGGAGCTGAAGAACTCCGC 
CATCTCCCTGCTGAACACC^CCXSCCATCGTGGT^ 

GCCATCCTGAACATCCCCCGCCGCATCCGCCAGGGCCTGGAGCGCGCCCTGCTGTAA 
2003 CON_14_BG Env

MKAKGTQRNWQSLWKWGTLILGLVIICSASNDLWVTVYYGVPVWKEATTTLFCASDAKAYDA
VALENVTENFNMWENNMVDQMQEDIISLWDQSLKPCVELTPLCVTLNCTDFNNTTNNTTNTRNDGEGEIKNCSFNITTSLRD
KIKKEYALFYNLDVVQMDNDNSSYRLTSCNTSIITQACPKVSFTPIPIHYCAPAGFVILKCNNKTFNGTGPCTNVSTVQCTH
GIRPWSTQIiLLNGSIJ^EEIVIRSKNFTDNAierilV^ 

ISKTKWNNTIiGQIVKKLREQFMNKTIVFQRSSGGDPEIVMHSFNCGGEFFYCNTTQL 
PCRIKQIVNMWQKVGKAMYAPPISGQIRCSSNITGIiLLIRDGGSmre^ 

AKRRWQRE KRAVG I GALLFGFLGAAGSTMGAASMTLTVQARQLIjS G I VQQQNNIjLRAI EAQQHMLQLiTVWG I KQLQARVLA 

VERYIjKDQQLI^IWGCSGKLICTTTVPWNASWSNKS^ 

KWASLWNWFNITNWLWYIKIFIMIIGGLIGIaR^ 

RliVSGFIiAIAWDDLRSLCIiFSYHRLRDFIIilAARTVELIiGRSSLKGLRIiGWEGLKY^ 
VANWTDRAIEVVQRVGRAVLNIPVRIRQGLERALL$

2003 CON_14_BG Env.seq.opt

ATGAAGGCCAAGGGCACCCAGCGCAACTGGCAGTCCCTGTGGAAGTGGGGCACCCTGATCCTGGGCCTGGTGATCATCTGCT
CCGCCTCCAACGACCTGTGGGTGACCGTGTACTACGGCGTGCCCGTGTGGAAGGAGGCCACCACCACCCTGTTCTGCGCCTC 
CGACGCCAAGGCCTACGACGCCGAGGTGCACAACGTGTGGGCCACCCACGCC 
GTGGCCCTGGAGAACGTGACCX3AGAACTTCAACATGTGGGA 

TGTGGGACCAGTCCCTGAAGCCCTGCGTGGAGCTGACCCCCCTGTGCGTGACCCTGAACTGCACCGACTTCAACAACACCAC 

CAACAACACCACCAACACCCGCAACGACGGCGAGGGCGAGATCAAGAACTGCTCCTTC^ 

AAGATCAAGAAGGAGTACGCCCTGTTCTAGAACCTGGACGTGGTGCAGATGGACAAC^^ 

CCTGCAACACCTCC^TCATCACCCAGGCCTGCCCCaUVGGTGTCCTTC^ 

CTTCGTGATCCTGAAGTGCAACAAGAAGACCTTTCAACGGCA^ 

GGGATCCGCCCCGTGGTGTCCACCCAGCTGCTGCTGAACGGCTCCCTGGCCGAGGAGGAGATCGTGATCCGCTCGAAGAACT 
TCACCGACAACGCCAAGACCATGATCGTGCAGCTGAAGGACCCCATCGAGATCAACT 
GAAGCGGATCACCATGGGCCCCGGCCGCGTGCTGTACACCAC 
ATCTCCAAGACCAAGTGGAACAACACCCTGGGCCAGATCGTGAAGAAGCTGCG 

TCCAGCGCTCCTCCGGCGGCGACCCCGAGATCGTGATGCACTCCTTCAACTGCGGCGGCGAGTTCTTCTACTGCAACACCAC 
CCAGCTCTTCy^CTCCACCIXXSCGCTCCy^CTCCACCTGGAACGA 

CCCTGCCGCATCAAGCAGATCGTGAACATGTGGCAGAAGGTGGGGAAGGCCATGTACGCCCCCCCCATCTCCGGCCAGATCC
GCTGCTCCTCCAACATCACCGGCCTGCTX3CTGATCCGCG 

CAACATGAAGGACAACTGGCGCTCCGAGCTGTACAAGTACAAGGTGGTGAAGATCGAGCCCCTGGGCGTGGCCCCCACCCGC 
GCCAAGCGCCGCGTGGTGCAGCGCGAGAAGCGCGCCGTGGGCATCGGCGCCCTGCTGTTCGGCTTCCTGGGCGCCGCCGGCT 
CCACCATGGGCGCCGCCTCCATGACCCTGACCGTGCAGGCCCGCCAGCTGCTGTCCGGCATC 
GCTGCGCGCCATCGAGGCCCAGCAGCACATGCTGCAGCTGACCGTGTGGGGCATCAAGCAGCTGC^ 

GTGGAGCGCTACCTGAAGGACCAGCAGCTGCTGGGCATCTGGGGCTGCTCCGGCAAGCTGATCTGCACCACCACCGTGCCCT 
GGAACGCCTCCTGGTCCAACAAGTCCCTGGACGACATCTGGAACAAGATGACCTGGAT^ 

CTACACCGGCCTGATCTACACCCTGATCGAGCAGTCCCAGAACCAGCAGGAGCGGAACGAGCAGGAGCTGCTGG 
AAGTGGGCCTCCCTGTGGAACTGGTTCAACATCACCAACTGGCTGTGGTACATGAAGATCTTC 

TGATCGGCCTGCGCATCGTGTTCGCCGTGCTGTCCATCATCAACCGCGTGCGCAAGGGCTACTCCCCCCTGTCCTTCCAGAC 
CCTGACCCACCACCAGCGCGAGCCCGACCGCCCCGGCCGCATCGAGGAGGAGGGCGGCGAGCAGGACAAGGACCGCTCCATC 
CGCCTGGTGTCCGGCTTCCTGGCCCTGGCCTGGGACGACCTGCGCTCCCTGTGCCTGTTCTCCTACCACCGCCTGCGCGACT 
TCATCCTGATCGCCGCCCGCACCGTGGAGCTGCTGGGCCGCTCCTCCCTGAAGGGCCTGCGCCTGGGCTGGGAGGGCCTGAA 
GTACCTGTGGAACCTGCTGCTGTACTGGGGCCGCGAGCTGAAGAACTCCGCCATCAACCTGCTGGACACCGTGGCCATCGCC 
GTGGCCAACTGGACCGACCGCGCCATCGAGGTGGTGCAGCGCGTGGGCCGCGCCGTGCTGAACATCCCCGTGCGCATCCGCC 
AGGGCCTGGAGCGCGCCCTGCTGTAA

Centralized HIV-1 gag/nef/pol Protein and the Codon-optimized Gene Sequences



1. 2003_CON_S gag.PEP

MGARASVLSGGKLDAWEKIRLRPGGKKKYRLKHL^

ATLYCVHQRIEVKDTKEALDKIEEEQNKSKQKTQQAAADTGNSSKVSQNYPIVQNLQGQMVHQAISPRTLNAWVKVVEEKAF
SPEVIPMFSALSEGATPQDLNTMLNTVGGHQAAMQMLKDTINEEAAEWDRLHPVHAG
IGWMTSNPPIPVGEIYKRWIILGLNKIVRMYSPVSILDIRQGPKEPFRDYVDRFF 
DCKTILKALGPGATLEEMMTACQGVGGPSHKARVLAEAMSQVTNTTIMM 

GCWKCGKEGHQMKDCTERQANFLGKIWPSNKGRPGNFLQSRPEPTAPPAESFGFGEEITPSPKQEPKDK^
NDPLSQ$ 

2003_CON_S gag.OPT

ATGGGCGCCCGCGCCTCCGTGCTGTCCGGCGGCAAGCTGGACGCCTGGGAGAAGATCCGCCTGCGCCCCGGCGGCAAGAAGA 

AGTACCGCCTGAAGCACCTGGTGTGGGCCTCCCGCGAGCTGGAGCGCTTCGCCCTGAACCCCGGCCTGCTGGAGACCTCCGA 

GGGCTGCCAGCAGATCATCGAGCAGCTGCAGCCCGCCCTGCAGACCGGCTCCGAGGAGCTGCGCTCCCTGTACAAC^ 

GCCACCCTGTACTGCGTGCACCAGCGCATCGAGGTGAAGGACACCAAGGAGGCCCTGGACAAGATCGAGGAGGAGCAGAACA

AGTCCAAGCAGAAGACCCAGCAGGCCGCCGCCGACACCGGCAACTCCTCCAAGGTGTCCCAGAACTACCCCATCGTGCAGAA

CCTGCAGGGCCAGATGGTGCACCAGGCC^TCTCCCCCCGCACCCTOAACGCCTGGGTG 

TCCCCCGAGGTGATCCCCATGTTCTCCGCCCTGTCCGAGGGCGCCAC 

GCGGCCACCAGGCCGCCATGCAGATGCTGAAGGACACCATCAACGAGGAGGCCGCCGAGTGGGACCGCCTG 

CGCCGGCCCC^TCCCCCCCGGCCAGATGCGCGAGCCCCGCGGCTCCX3ACATCGCCGGCACCACCTCCACCCTGCAG^ 

ATCGGCTGGATGACCTCCAACCCCCCCATCCCCGTGGGCGAGATCTACAAGCGCTGGATCATCCTGGGCCTGAACAAGATCG 

TGCGCATGTACTCCCCCGTGTCCATCCTGGACATCCGCCAGGGCCCCAAGGAGCCCTTCCGCGACTACGT^ 

CAAGACCCTGCGCGCCGAGGAGGCCACCCAGGACGTGAAGAACTGGATGACCGACACCCTGCTGGTGCA 

GACTGGAAGACC^TCCTGAAGGCCCTGGGCCCCGGCGCCACCCTGGAGGAGATGATGACCG 

CCTCCCACAAGGCCCGCGTGCTGGCCGAGGCCATGTCCCAGGTGACCAACACCACCATCATGATGCAGCGCGGCAACTTCAA
GGGCCAGJ^GCGCATC^TCAAGTGCTTCAACTGCGGCAAGGAGGGCCACATCGCCCGC^ 
GGCTGCTGGAAGTGCGGCAAGGAGGGCCACCAGATGAAGGACTIX3CACCGAGCGCCAG 
CCTCCAACAAGGGCCGCCCCGGCAACTTCCTGCAGTCC^ 

CGAGGAGATCACCCCCTCCCCCAAGCAGGAGCCCAAGGACAAGGAGCTGTACCCCCTGGCCTCCCTGAAGTCCCTGTTCGGC 
AACGACCCCCTGTCCCAGTAA 



2. 2003_M.GROUP.anc gag.PEP

MGARASVLSGGKLDAWEKIRLRPGGKKKYRLKHLVWASRELERFAL^

ATLYCVHQRIEVKDTKEALDKIEEEQNKSQQKTQQAAADKGDSSQVSQNYPIVQNLQGQMVHQAISPRTLNAWVKVVEEKAF
SPEVIPMFSALSEGATPQDLNTMLNTVGGHQAAMQMLKDTINEEAAEWDRLHPVHAGPIP
IGWMTSNPPIPVGEIYKRWIILGLNKIVRMYSPVSILDIRQGPKEPFRDYVDR
DCKTILKALGPGATLEEMMTACQGVGGPGHKARVLAEAMSQVTNANIM^

GCWKCGKEGHQMKDCTERQANFLGKIWPSNKGRPGNFLQSRPEPTAPPAESFGFGEEITPSPKQEPKDKELYPLASLKSLFG
SDPLSQ$ 

2003_M.GROUP.anc gag.OPT

ATGGGCGCCCGCGCCTCCGTGCTGTCCGGCGGCAAGCTGG^ 

AGTACCGCCTGAAGCACCTGGTGTGGGCCTCCCGCGAGCTGGAGCGCTTCGCCCTGAACCCCGGCCTGCTGGAGACCGCCGA 
GGGCTGCGAGCAGATCATGGGCCAGCTGCAGCCCGCCCTGC^^ 

GCCACCCTGTACTGCGTGCACCAGCGCATCGAGGTGAAGGACACCAAGGAGGCCCTGGACAAGATCGAGGAGGAGCAGAACA 
AGTCCCAGCAGAAGACCCAGCAGGCCGCCGCCGACAAGGGCGACTCCTCCCAGGTGTCCCAGAACTACCCCATCGTGCAGAA 
CCTGCAGGGCCAGATGGTGCACCAGGCGATCTCCCCCCGCACCCT 

TCCCCCGAGGTGATCCCCATGTTCTCCGCCCTGTCCGAGGGCGCCACCCCCCAGGACCTGAACACCATGCTGAACACCGTGG 

GCGGCCACCAGGCCGCCATGCAGATGCTGAAGGACACCATCAACGAGGAGGCCGCCGAGTGGGACCGCCTGCACCCCGTGCA 

CGCCGGCCCCATCCCCCCCGGCCAGATGCGCGAGCCCCGCGGCTCCGACATCGCCGGCACCACCTCCACCCTGCAGGAGCAG 

ATCGGCTGGATGACCTCCAACCCCCCCATCCCCGTGGGCGAGATCTACAAGCGCTGGATCATCCTGGGCCTGAACAAGATCG 

TGCGCATGTACTCCCCCGTGTCCATCCTGGACATCCGCCAGGGCCCCAAGGAGCCCTTCCGCGACTACGTGGACCGCTTCTT 

CAAGACCCTGCGCGCCGAGCAGGCGACCCAGGACGTGAAGAACTGGATGACCGACACCCTGCTGGTGCAGAACGCCAACCC 

GACTGCAAGACCATCCTGAAGGCCCTGGGCCCCGGCGCCACCCTGGAGGAGATGATGACCGCCTGCCAGGGCGTGGGCGGCC 

CCGGCCACAAGGCCCGCGTGCTGGCCGAGGCCATGTCCCAGGTGACCAACGC 

GGGCCCCCGCCGCATCGTGAAGTGCTTCAACTGCGGCAAGGAGGGCCACATCGCCCGCAACTGCCGCGCCCCCCGCAAGAAG
(X3CTGCTGGAAGTGCGGCAAGGAGCX3CCACC^
CCTCCAACAAGGGCCGCCCCGGCAACTTCCTGCAGTCCCGC^ 

CGAGGAGATCACCCCCTCCCCC?U\GCAGGAGCCCAAGGACAAGGAGCTGTACCCCCTGGCCTCCCrrGAAGTCCCTGTTCGGC 
TCCGACCCCCTGTCCCAGTAA 



3. 2003_CON_A1 gag.PEP

MGARASVLSGGKLDAWEKIRLRPGGKKKYRLKHLVWASREL

ATLYCVHQRIDVKDTKEALDKIEEIQNKSKQKTQQAAADTGNSSKVSQNYPIVQNAQGQMVHQSLSPRTLNAWVKVIEEKAF
SPEVIPMFSALSEGATPQDLNMMLNIVGGHQAAMQMLKDTINEEAAEWDRLH
IGWMTGNPPIPVGDIYKRWIILGLNKIVRMY

DCKSILRALGPGATLEEMMTACQGVGGPGHKARVLAEAMSQVQHTNIMMQRGNFRGQKRIKCFNCGKEGHLARNCRAPRKKG
CWKCGKEGHQMKDCTERQANFLGKIWPSSKGRPGNFPQSRPEPTAPPAEIFGMGEEITSPPKQEQKDREQDPPLVSLKSLFG

NDPLSQ$ 

3. 2003_CON_A1 gag.OPT

ATGGGCGCCCGCGCCTCCGTGCTGTCCGGCGGCAAGCTGGACGCCTGGGAGAAGATCCGCCTGCGCCCCGGCGGCAAGAAGA 

AGTACCGCCTGAAGCACCTGGTGTGGGCCTCCCGCGAGCTGGAGCGCTTCGCCCTGAACCCCrrCCCTGCTGGA 

GGGCTGCCAGCAGATCATGGAGCAGCTGCAGCCCGCCCTGAAGACCGGCACCGAGGAGCTGCGC^ 

GCCACCCTGTACTGCGTGCACCAGCGCATCGACGTGAAGGACACCAAGGAGGCCCTGGAC^UVGATCGAGGAGATCCAGAACA 

AGTCCAAGCAGAAGACCCAGCAGGCCGCCGCCGACACCGGCAACTCCTCCAAGGTGTCCCAGAACTACCCCA^ 

CGCCCAGGGCCAGATGGTGCACCAGTCCCTGTCCCCCCX3CACCCTGAACGCCTGGGTGAAGGTGATCGAGGAGAAGGCCTTC 

TCCCCCGAGGTGATCCCCATGTTCTCCGCCCTGTCCGAGGGCGCCACCCCCCAGGACCTGAAC^TGATGCTGAACATCGTGG 

GCGGCCACCAGGCCGCCATGCAGATGCTGAAGGACACCATCAACGAGGAGGCCGCCGAGTGGGACCGCCTGCACCCCGTGCA 

CGCCGGCCCCATCCCCCCCGGCCAGATGCGCGAGCCCCGCGGCTCCGA(^TCGCCGGC^CCACCTCC^CCCCCCAGGAGC^G 

ATCGGCTOGATGACCGGCAACCCCCCCATCCCCGTGGGCGACATCTAC^^ 

TGCGCATGTACTCCCCCGTGTCCATCCTGGACATCAAGCAGGGCCCCAAGGAGCCCTTCCGCGACTACGTGGACCGCTTCTT ' 

CAAGACCCTGCGCGCCGAGCAGGCCACCCAGGAGGTGAAGAACTGGATGACCGAGACCCTGCTGGTGCAGAACGCCAACCCC 

GACTGCAAGTCCATCCTGCGCGCCCTGGGCCCCGGCGCC^CCCrGGAGGAGATGATGACCGCCTGCCAGGGCGTGGGCGGCC 

CCGGCCAG^GGCCCGCGTGCTGGCOTAGGCCATGTCCCAGGTGCAGC^ 

CGGCCAGAAGCGCATCAAGTGCTTCAACTGCGGCAAGGAGGGCGACCTGGCCCGCA 

TGCTGGAAGTGCGGCAAGGAGGGCCACCAGATGAAGGACTGC^CCGAGCGCCAGGC 

CCTCC^GGGCCGCCCCGGCAACOTCCCCCAGTCCCGCCCCGAGCCCACCGCCCCCCCCGCCGAGATCTTCGGCATGOT 

GGAGATCACCTCCCCCCCCAAGCAGGAGCAGAAGGACCGCGAGGAGGACCCCCCCCTGGTGTCCCTGA 

AACGACCCCCTGTCCCAGTAA 

4. 2003_A1.anc gag.PEP

MGARASVLSGGKLDAWEKIRLRPGGKKKYRLKHLVWASRELERFALNPG

ATLYCVHQRIEVKDTKEALDKIEEIQNKSKQKTQQAAADTGNSSKVSQNYPIVQNAQGQMVHQSLSPRTLNAWVKVIEEKAF
SPEVIPMFSALSEGATPQDLNMMLNIVGGHQAAMQMLKDTINEEAAEWD^^
IGWMTGNPPIPVGDIYKRWIILGLNKIVRMYSPVSI

DCKSILRALGPGATLEEMMTACQGVGGPGHKARVLAEAMSQVQNTDIMMQRGNFRGPKRIKCFNCGKEGHLARNCRAPRKKG
CWKCGKEGHQMKDCTERQANFLGKIWPSSKGRPGNFPQSRPEPTAPPAENFGMGEEMISSPKQEQKDREQYPPLVSLKSLFG 

NDPLSQ$ 

2003_A1.anc gag.OPT

ATGGGCGCCCGCGCCTCCGTGCTGTCCGGCGGCAAGCTGGACGCCTGGGAGAAGATCCGCCTGCGCCCCG^^ 
AGTACCGCCTGAAGCACCTGGTGTGGGCCTCCCGCGAGCTGGAGCGCTTCGCCCTGAACCCCGGCCTGCTGGAGACCGCCGA
GGGCTGCCAGCAGATCATGGGCCAGCTGCAGCCCGCCCTGAAGACCGGCACCGAGGAGCTGCGCTCCCTGTACAA 
GCCACCCTGTACTGCGTGCACCAGCGCATCGAGGTGAAGGACACCAAGGAGGCCCTGGACAAGATCGAGGAGATCCAGAACA 
AGTCCAAGCAGAAGACCCAGCAGGCCGCCGCCGACACCGGCAACTCCT'CCAAGGTGTCCGAGAACTAC 

CGCCCAGGGCCAGATGGTGCACCAGTCCCTGTCCCCCCGCACCCTGAACGCCTGGGTGAAGGTGATCGAGGAGAAGGCCTTC 
TCCCCCGAGGTGATCCCCATGTTCTCCGCCCrroTCCGAGGGCGCCACCCCCCAGGACCTGAACATGATGCTGAACATCGTGG 
GCGGCCACCAGGCCGCCATGCAGATGCTGAAGGACACCATCAACGAGGAGGCCGCCGAGTGGGACCGCCTGCACCCCGTGCA 
CGCCGGCCCCATCCCCCCCX^CCAGATGCGCGAGCCCCGCGGCTCCGACATCGCCG^ 

ATCGGCT'GGATGACCGGCAACCCCCCCATCCCCGTGGGCGACATCTACAAGCGCTCSGATCATCCTGGGCCTGAACAAGATCG 
TGCGCATGTACTCCCCCGTGTCCATCCTCGACATCCGCC^ 

CAAGACCCT^CGCGCCGAGCAGGCCACCCAGGAGGTGAAGAACTGGATGACCGAGACCCTGCTGGTGCAGAACGCCAAC 
GACTGCAAGTCCATCCTGCGCGCCCTGGGCCCCGGCGCCACCCTGGAGGAGATGATGACCGCCTGCCAGGGCGTGGGCXXirCC 




CCGGCCACAAGGCCCGCGTGCTGGCCGAGGCCATGTCCCAGGTGCAGAACACCGACATCATGATGCAGCGCGGCAACTTCCG
CGGCCCCAAGCGCATCAAGTGCTTCAACTGCGGCAAGGAGGGCCACCTGGCCCGCAACTGCCGCGC 
TGCTGGAAGTGCGGCAAGGAGGGCCACCAGATGAAGGACTGCACCGAGCGCCAGGCCAAC^ 
CCTCC^^GGGCCGCCCCGGCAACTTCCCCCAGTCCCGCCCCGAGC^ 

GGAGATGATCT^CCTCCCCCAAGCAGGAGCAGAAGGACCGCGAGCAGTACCCCCCCCTGGTGTCCCTGAAGTCCCTGTTC 
AACGACCCCCTGTCCCAGTAA 

5. 2003_CON_A2 gag.PEP

MGARASILSGGKLDAWEKIRLRPGGKKKYRLKHLVWASRELEKFSINPSLLETSEGCRQIIRQLQPALQTGTEELKSLYNTV
AVLYCVHQRIDVKDTKEALDKIEEEQNKCKQKTQHAAADTGNSSSSSQNYPIVQNAQGQMVHQAISPRTLNAWVKVVEEKAF
SPEVIPMFTALSEGATPQDLNTMLNTVGGHQAAMQMLKDTINEEAAEWDRLHPVHAGPIPPGQMREPRGSDIAGTTSTLQEQ
IGWMTSNPPIPVGEIYKRWIILGLNKIVRMYSPVSILDIRQGPKEPFRDYVDRFFKTLRAEQATQEVKNWMTDTLLVQNANP
DCKSILRALGPGATLEEMMTACQGVGGPSHKARVLAEAMSQVQNTN
KGCWKCGKEGHQMKDCTERQANFLGKIWPSNKGRPGNFPQSRTEPTAPPAEin^
FGNDPLSQ$ 









2003_CON_A2 gag.OPT

ATGGGCGCCCGCGCCTCC^TCCTGTCCGGCGGCAAGCTGGACGCCTGGGAGAAGATCCGCCTC 
AGTACCGCCTGAAGCACCTGGTGTGGGCCTCCCGCGAGCTGGAGAAGTTCT 

GGGCTGCCGCCAGATCATCCGCCAGCTGCAGCCCGCCCTGCAGACCGGCACCGAGGAGCTGAAGTCCCrreTACAAGACCGT 

GCCGTGCTGTACTGCGTGCACCAGCGCATCGACGTGAAGGACACCAAGGAGGCCCTGGACAAGATCGAGGAGGAGCAGAACA 

AGTGCAAGC^GAAGACCCAGC^CGCCGCCGCCGACACCGGCAACT 

CGCCCAGGGCCAGATGGTGCACCAGGCCATCTCCCCCCGCACCCTGAACGCCTGGGTGAAGGTGGTGGAGGAGAAGGCCTTC 
TCCCCCGAGGTGATCCCCATGTTCACCGCCCTGTCCGAGGGCGCCACCCCCCAGGACCTGAACACCATGCTGAACACCGTGG 
GCGGCCACGAGGCCGCC^TGCAGATGCTGAAGGAC^CCATCAACGAGGAGGCCGCCGAGT^^ 

CGCCGGCCCCATCCCCCCCGGCCAGATGCGCGAGCCCCGCGGCTCCGACATCGCCGGCACCACCTCCACCCTGCAGGAGCAG 
ATCGGCTGGATGACCTCCAACCCCCCCATCCCCGTGGGCGAGATCTACAAGCGCTGGATCATCCTGGGCCTGAACAAGATCG 
TGCGC^TGTACTCCCCCGTGTCC^TCCTGGAC^TC^ 

CAAGACCCTGCGCGCCGAGCAGGCCACCCAGGAGGTGAAGAACTGGATGACCGACACCCTGCTGGTGCAGAACGCCAACCCC 

GACTGCAAGTCCATCCTGCGCGCCCTGGGCCCCGGCGCCACCCTGGAGGAGATGATGACCGCCTGCCAGGGCGTGGGCGGCC 

CCTCCCACAAGGCCCGCX5TGCTGGCCGAGGCCATGTCCCAGGTGCAGAACACCAACACCAACATCATGATGCAGCGCGG 

CTTCCGCGGCCAGAAGCGCATGAAGTGCTTCAACTGCGGCAAGGAGGGCCACCTGGCCCGCAACTGCCGCGCCCCCCGCAAG

AAGGGCTGCTGGAAGTGCGGGAAGGAGGGCCACCAGATGAAGGACTGCACCGAGCG^ 

GGCCCTCCAACAAGGGCCXSCCCCGGCAACTTCCCCCAGTCCCGC^CC^ 

GGGCGAGGAGATCACCTCCTCCCTGAAGCAGGAGCTGAAGACCCGCGAGCCCTACAACCCCGCCATCTCCCTGAAGTCCCTG 
TTCGGCAACGACCCCCTGTCCCAGTAA 

6. 2003_CON_B gag.PEP

MGARASVLSGGELDRWEKIRLRPGGKKKYKLKHIVWASREL^

ATLYCVHQRIEVKDTKEALEKIEEEQNKSKKKAQQAAADTGNSSQVSQNYPIVQNLQGQMVHQAISPRTLNAWVKVVEEKAF
SPEVIPMFSALSEGATPQDLNTMLNWGGHQAAMQ^ 

IGWMTNNPPIPVGEIYKRWIILGLNKIVRMYSPTSILDIRQGPKEPFRDYVDRFYKTLRAEQASQEVKNW
DCKTILKALGPAATLEEMMTACQGVGGPGHKARVLAEAMSQVTNSATIIWQRGNFRN

KGCWKCGKEGHQMKDCTERQANFLGKIWPSHKGRPGNFLQSRPEPTAPPEESFRFGEETTTPSQKQEPIDKELYPLAS$



2003_CON_B gag.OPT

ATGGGCGCCCGCGCCTCCGTGCTGTCCGGCGGCGAGCTGGACCGCTGGGAGAAGATCCGCCTGCGCCCCGGCGGCAAGAAGA 
AGTACAAGCTGAAGC^CATCGTGTGGGCCTCCCGCGAGCTGG^ 

GGGCTGCCGCCAGATCCTGGGCCAGCTGCAGCCCTCCCTGCAGACCGGCTCCGAGGAGCTGCGCTCCCTGTACAACACCGTG 
GCCACCCTX3TACTGCGTGCACCAGCGCATCGAGGTGAAGGACACCAAGGAGGCCCTGGAGAAGATCGAGGAGGAGCAGAACA 
AGTCCAAGAAGAAGGCCCAGCAGGCCGCCGCCGACACCGGCAACTCCTCCCAGGTGTCCCAGAACTACCCCATCGTGCAGAA 
CCTGCAGGGCCAGATGGTGCAXTCAGGCCATCTCCCCCC^ 

TCCCCCGAGGTGATCCCCATGTTCTCCGCCCTGTCCGAGGGCGCCACCCCCCAGGACCTGAACACCATGCTGAACACCGTGG
GCGGCCACCAGGCCGCCATGCAGATGCTGAAGGAGACCATCAACGAGGAGGCCGCCGAGTGGGACCGCCTGCACCCCGTGCA 
CGCCGGCCCCATCGCCCCCGGCC^GATGCGCGAGCCCCGCGGCTCCGACA^ 

ATCGGCTGGATGACCAACAACCCCCCCATCCCCGTGGGCGAGATCTACAAGCGCTGGATCATCCTGGGCCTGAACAAGATCG
TGCGCATGTACTCCCCCACCTCCATCCTGGACATCCGCCAGGGCCCCAAGGAGCCCTTCCGCGACTACGTGGACCGCTTCTA
CAAGACCCTGCGCGCCGAGCAGGCCTCCCAGGAGGTGAAGAACTGGATGACCGAGACCCTGCTGGTGCAGAACGCCAACCCC
GACTGCAAGACCATCCTGAAGGCCCTGGGCCCCGCCGCCACCCTGGAGGAGATGATGACCGCCTGCCAGGGCGTGGGCGGCC
CCGGCCACAAGGCCCGCGTGCTGGCCGAGGCCATGTCCCAGGTGAC
CCGCAACCAGCGCAAGACCGTGAAGTGCTTGAACTGCGGCAAGGAG^ 

AAGGGCTGCTGGAAGTGCGGCAAGGAGGGCCACCAGATGAAGGACTGCACCGAGCGCCAGGCCAACTTCCTGGGCAAGATCT

GGCCCTCCCACyUlGGGCCGCCCCGGCyUVCTTCCTGCAGTCCCGCCCCGAGCCCACCGCCCCCCCCGAGGAGTCCTTCC 

CGGCGAGGAGACCACCACCCCCTCCCAGAAGCAGGAGCCCATCGACAAGGAGCTGTACCCCCTGGCCTCCTAA

7. 2003_B.anc gag.PEP

MGARASVLSGGKLDKWEKIRLRPGGKKKYKLKHIV^

ATLYCVHQRIEVKDTKEALDKIEEEQNKSKKKAQQAAADTGNSSQVSQNYPIVQNLQGQMVHQAISPRTLNAWVKVVEEKAF
SPEVIPMFSALSEGATPQDLNTMLNTVGGHQAAMQMLKETINEEAAEWDRLHPVHAGPIAPGQMREPRGSDIAGTTSTLQEQ
IGWMTNNPPIPVGEIYKRWIILGLNKIVRMYSPISILDIRQGPKEPFRDYVDRFYKTLRAEQASQDVKNWMTETLLVQNANP
DCKTILKALGPAATLEEMMTACQGVGGPGHKARVLAEAMSQVTNSTTIMMQRGNFRDQ

KGCWKCGKEGHQMKDCTERQANFLGKIWPSHKGRPGNFLQSRPEPTAPPEESFRFGEETTTPSQKQEPIDKELYPLASLKSL
FGNDPSSQ$ 

2003_B.anc gag.OPT

ATGGGCGCCCGCGCCTCCGTGCTGTCCGGCGGCAAGCTGGACAAGTGGGAGAAGATCCGCCTGCGCCCCGGCGGCAAGAAGA 

AGTAC^GCTGAAGCACATCGTGTGGGCCTCCCGCGAGCTGGAGCGCTTCGCCGTGAACCCCGGCCTTC 

GGGCTGCCGCCAGATCCTGGGCCAGCTGCAGCCCGCCCTGCAGAC 

GCCACCCTGTACTGCGTGCACCAGCGCATCGAGGTGAAGGACACCAAGGAGGCCCTGGACAAGATCGAGGAGGAGCAGAACA
AGTCCAAGAAGAAGGCCCAGCAGGCCGCCGCCGACACCGGCAACTCCTCCCAGGTGTCCGAGAACTACCCCA
CCTGCAGGGCCAGATGGTGCACCAGGCCATCTCCCCCCGCACCCTGAACGCCTGGGTGAAGGTGGTGGAGGAGAAGGCCTTC
TCCCCCGAGGTGATCCCCATGTTCTCCGCCCTGTCCGAGGGCGCCACCCCCCAGGACCTGAACACCATGCTGAAC^CCGTGG 
GCGGCCACCAGGCCGCCATGCAGATGCTGAAGGAGACCATCAACGAGGAGGCCGCCGAGTGGGACCGCCTGCACCCCGTGCA 

CGCCGGCCCCATCXSCCCCCGGCC^GATGCGCGAGCCCCGC^^ 

ATCGGCTGGATGACCAACAACCCCCCCATCCCCGTGGGCGAGATCTACAAGCGCTGGATCATCCTGGGCCrrGAACAAGATCG 

TGCGCATGTACTCCCCCATCTCCATCCTGGACATCCGCCAGGGCCCCAAGGAGC 

CAAGACCCTGCGCGCCGAGCAGGCCTCCCAGGACGTGAAGAA 

GACTGCAAGACCATCCTGAAGGCCCTGGGCCCCGCCGCCACCCTGGAGGAGATGATGACCGCCTGCC^ 

CCGGCCACAAGGCCCGCGTGCTGGCCGAGGCCATGTCCCAGGTGACCAACTCCACCACCATCATGATGCAGCGCGGC 

CCGCGACCAGCGCAAGATCGTGAAGTGCTTCAACTGCGGCAAGGAGGGCGACATCG 

AAGGGCTGCTGGAAGTGCGGCAAGGAGGGCCACCAGATGAAGGACTGCACCGAGCGCCAGGCCAACTTCCTGGGCAAGATCT 
GGCCCTCCCACAAGGGCCGCCCTOGCAACTTCCTGCAGTCCCGCCCCGAGCCCACCGCCCCCCCCGAGGAGTCCOT 
CGGCGAGGAGACCACCACCCCCTCCCAGAAGCAGGAGCCCATCGACAAGGAGCTGTACCCCCTGGCCTCCCTGAAGTCCCTG 
TTCGGCAACGACCCCTCCTCCCAGTAA 

8. 2003_CON_C gag.PEP

MGARASILRGGKLDKWEKIRLRPGGKKHYMLKHLVWASRELERFALNPGLLETSEGCKQIIKQLQPALQTGTEELRSLYNTV
ATLYCVHEKIEVRDTKEALDKIEEEQNKSQQKTQQAKAADGKVSQNYPIVQNLQGQMVHQAISPRTLNAWVKVIEEKAFSPE
VIPMFTALSEGATPQDLNTMLNTVGGHQAAMQMLKDTINEEAAEWDRLHPVHAGPIAPGQMREPRGSDIAGTTSTLQEQIAW
MTSNPPIPVGDIYKRWIILGLNKIVRMYSPVSILDIKQGPKEPFRDYVDRFFKTLRAEQATQDVKNWMTDTLLVQNANPDCK
TILRALGPGATLEEMMTACQGVGGPSHKARVLAEAMSQANN^
KCGKEGHQMKDCTERQANFLGKIWPSHKGRPGNFLQNRPEPTAPPAESF

2003_CON_C gag.OPT

ATGGGCGCCCGCGCCTCCATCCTGCGCGGCGGCAAGCTGGACAAGTGGGAGAAGATCCGCCTGCGCCCCGGCGGCAAGAAGC 
ACTACATGCTGAAGCACCTGGTGTGGGCCTCCCGCGAGCTGGAGCGCTTCGCCCTGAACCCCGGCCTGCTGGAGACCTCCGA
GGGCTGCAAGCAGATCATGAAGCAGCTGCAGCCCGCCCTGCAGACCGGCACCGAGGAGCT^ 
GCCACCCTGTACTGCGTGCACGAGAAGATCGAGGTGCGCGACACCAAGGAGGCCCT 

AGTCCCAGCAGAAGACCCAGCAGGCCAAGGCCGCCGACGGCAAGGTGTCCCAGAACTACCCCATCGTGCAGAACCTGCAGGG 
CCAGATGGTGCACCAGGCCATCTCCCCCCGCACCCTGAACGCCTGGGTGAAGGTGATCGAGGAGAAGGCCTTCTCCCCCGAG 
GTGATCCCCATGTTCACCGCCCTGTCCGAGGGCGCCACCCCCCAGGACCTGT^ACACCATGCTGAACACCGTGGGCGGCCACC 
AGGCCGCCATGCAGATGCTGAAGGACACCATCAACGAGGAGGCCGCCGAGTGGGACCGCCTGCACCCCXSTGC^ 
CATCGCCCCCGGCCAGATGCGCGAGCCCCGCGGCTCCGACATCGCCGGCACCACCTCCACCCTGCAGGAGCAGATCGCCTGG 
ATGACCTCCAACCCCCCCATCCCCGTGGGCGACATCTACAAGCGCTGGATCATCCTGGGCCTGAACAAGATCGTGCGCATGT 
ACTCCCCCGTGTCC^TCCTGGAGATGAAGCAGGGCCCCAAGGAGCCCTTCCGCG 

GCGCGCCGAGCAGGCCACCGAGGACGTGAAGAACTGGATGACCGACACCCTGCTGGTGCAGAACGCCAACCCCGACTGCAAG
ACCATCCTGCGCGCCCTGGGCCCCGGCGCCACCCTGGAGGAGATGATGACCGCCTGCCAGGGCGTGGGCGGCCCCTCCCACA 



AGGCCCGCGTGCTGGCCGAGGCCATGTCCCAGGCCAACAAC^^ 
GCGCATCGTGAAGTGCTTCAACTGCGGCAAGGAGGGCCACATCGC^ 

AAGTGCGGCAAGGAGGGCCACCAGATGAAGGACTGCACCGAGCGCCAGGCCAACTTCCTGGGCAAGAT 
AGGGCCGCCCCGGCAACTTCCTGCAGAACCGCCCCGAGCCCACCGCC^ 

CCCCGCCCCCy^GC^GGAGCCCAAGGACCGCGAGCCCCTGACCTCCCTGAAGTCCCTGTTCGGCTCCGACCCCCTGT 
TAA 

9. 2003_C.anc.gag.PEP

MGARASILRGGKLDTWEKIRLRPGGKKHYMIKHLVWASRELERFALNPGLLETSEGCKQIMKQLQPALQTGTEELRSLYNTV
ATLYCVHERIEVRDTKEALDKIEEEQNKSQQKT
SPEVIPMFTALSEGATPQDLNTMLNTVGGHQAAMQML
IAWMTSNPPIPVGDIYKRWIILGLNKIVRMYSPVSI^
DCKTILRALGPGATLEEMMTACQGVGGPGHKARV^ 

GCWKCGKEGHQMKDCTERQANFLGKIWPSHKGRPGNFLQSRPEPTAPPAESFRFEETTPAPKQEPKDREPLTSLKSLFGSDP
LSQ$

2003_C.anc.gag.OPT

ATGGGCGCCCGCGCCTCCATCCTGCGCGGCGGCAAGCTGGACACCTGGGA 
ACTACATGATCAAGCACCTGGTGTGGGCCTCCCGCGAGCTGGAGCGCT^ 

GGGCTGCAAGCAGATCATGAAGCAGCTGCAGCCCGCCCTGC^GACCGGCACCGAGGAGCTGCG 
GCCACCCTGTACTGCGTGCACGAGCGCATCGAGGTGCGCGACACGAAGGAGGCCCT^ 

AGTCCCAGCAGAAGACCCAGCAGGCCGAGGCCGCCGACGGCGACAACGGCAAGGTGTCCCAGAACTACCCCATCGTGCAGAA 
CCTGCAGGGCCAGATGGTGCACCAGGCCATCrrCCCCCCGCACCCTGAACGCCTGGGTGAAGGTGGTG 

TCCCCCGAGGTGATCCCCATGTTCACCGCCCTGTCCGAGGGCGCCACCCCCCAGGACCTGAACACCATGCTGAACACCGTGG 
GCGGCCACCAGGCCGCCATGCAGATGCTGAAGGACACCATCAACGAGGAGGCCGCCGAGTGGGACCGCCTGCACCCCGTGCA
CGCC^MCCCCGTGGCCCCCGGCCAGATGCGCGAGCCCCGC^ 

ATCGCCTGGATGACCTCO^CCCCCCCATCCCCGTGGGCGACATCTACAAGCGCTGGATCATCCTGGG^ 

TGCGCATGTACTCCCCCGTGTCCATCCTGGACATCAAGCAGGGCCCCAAGGAGCCCTTCCGCGACTACGTGGACCGCTTCTT
CAAGACCCTGCGCGCCGAGCAGGCCACCCAGGACGTGAAGAACTGGATGACCGACACCCTGCTGGTGCAGAACGCCAACCCC 
GACTGCAAGACCATCCTGCGCGCCCTGGGCCCCGGCGCCACCCTGGAGGAGATGATGACCGCCTGCCAGGGCGTGGGCGGCC 
CCGGCCACAAGGCCCGCGTGCTGGCCGAGGCCATGTCCCAGGCCAACAACACC 

GGGCCCCAAGCGCATCGTGAAGTGCTTCAACTGCGGCAAGGAGGGCCACATCGCCCGCAACTGCCGCGCCCCCCGCAAGAAG

GGCTGCTGGAAGTGCGGCAAGGAGGGCCACCAGATGAAGGACTGCACCGAGCGCCAGGCCAACTTCCTGGGCAAGAT 

CCTCCC^CAAGGGCCGCCCCGGC^^CTTCCTGCAGTCCC^ 

GGAGACC^CCCCCGCCCCCyU^GCAGGAGCCC^^GGACCGCGAGCCCCTGACCTCCCTGAAGTCCCTGTTCGGCTCCGA^ 
CTGTCCCAGTAA 

10. 2003_CON_D gag.PEP

MGARASVLSGGKLDAWEKIRLRPGGKKKYRLKHIVWASRELERFALN
ATLYCVHERIEVKDTKEALEKIEEEQNKSKKKAQQAAADTC 

SPEVIPMFSALSEGATPQDLNTMLNTVGGHQAAMQMLKETINEEAAEWDRLHPVHAGPV^

IGWMTSNPPIPVGEIYKRWIILGLNKIVRMYSPVSILDIRQGPKEPFRDYVDRFYKTLRAEQASQDVKNWMTETLLVQNANP
DCKTILKALGPEATLEEMMTACQGVGGPSHKARVLAEAMSQATNSAAVMMQRGNFKGPRKIIKCFNCGKEGHIAKNCRAPRK
KGCWKCGKEGHQMKDCTERQANFLGKIWPSHKGRPGNFLQSRPEPT^
GNDPLSQ$ 

2003_CON_D gag.OPT

ATGGGCGCCCGCGCCTCCGTGCTGTCCGGCGGCAAGCTGGACGCCTGGGAGAAGATCCGCCTGCGCCCCGGCGGCAA 
AGTACCGCCTGAAGCACATCGTGTGGGCCTCCCGCGAGCTGGAGCGCTTCGCCCTGAACCCCGGCCTC 
GGGCTGCAAGCAGATCATCGGCCAGCTGCAGCCCGCCATCCAGACCGGCT

GCCACCCTGTACTGCGTGCACGAGCGCATCGAGGTGAAGGACACCAAGGAGGCCCTGGAGAAGATCGAGGAGGAGCAG 
AGTCCAAGAAGAAGGCCCAGCAGGCCGCCGCCGACACCGGCAACTCCTCCCAGGTGTCCCAGAACTACCCCATCGTGCAGAA 
CCTGCAGGGCCAGATGGTGCACCAGGCCATCTCCCCCCGCACCCTGAACGCCTGGGTGAAGGTGATCGAGGAGAAGGCCTTC 
TCCCCCGAGGTGATCCCCATGTTCTCCGCCCTGTCCGAGGGCGCCACCCCCCAGGACCTGAACACCATGCTGAACACCGTGG
GCGGCCACCAGGCCGCCATGCAGATGCTGAAGGAGACCATCAACGAGGAGGCCGCCGAGTGGGACCGCCTC
CGCCGGCCCCGTGGCCCCCGGCCAGATGCGCGAGCCCCGCGGC^ 

ATCGGCTGGATGACCTCCAACCCCCCCATCCCCGTGGGCGAGATCTACAAGCGCTGGATCATCCTGGGCCTGAACA 
TGCGCATGTACTCCCCCGTGTCCATCCTGGACATCCGCCAGGGCCCCAAGGAGCCCTTCCGCGACTACGTGGACCGCTTCTA 
CAAGACCCTGCGCGCCGAGCAGGCCTCCCAGGACGTGAAGAACTGGATGACCGAGACCCTGCTGGTGCAGAACGCCAACCCC
GACTGCAAGACCATCCTGAAGGCCCTGGGCCCCGAGGCCACCCTGGAGGAGATGATGACCGCCTGCCAGGGCGTGGGCGGCC

CCTCCCACAAGGCCCGCGTGCTGGCCGAGGCCT^TGTCCCA^ 

CAAGGGCCCCCGCAAGATCATCAAGTGCTTCAACTGCGGCAAGGAGGGCCAC^ 

AAGGGCTGCTGGAAGTGCGGCAAGGAGGGCCACCAGATGAAGGACTGCAC^ 

GGCCCTCCCACAAGGGCCGCCCCGGC^UVCTTCCro 

CGGCGAGGAGATCACCCCCTCCCAGAAGCAGGAGCAGAAGGACAAGGAGCTGTACCCCCTGACCTCCCTGAAGTCCCTGTTC 
GGCAACGACCCCCTGTCCCAGTAA 



11. 2003_CON_F gag.PEP
MGARASVLSGGKLDAWEKIRLRPGGKKKYRMKHLVWASRELER
AVLYCVHQKVEVKDTKEALEKLEEEQNKSQQKTQQAAADKGVSQNYPIVQ
IPMFSALSEGATPQDLNTMLNTVGGHQAAMQMLKDT^

TSNPPVPVGDIYKRWIILGLNKIVRMYSPVSILDIRQGPKEPFRDYVDRFFKTLRAEQATQEVKGWMTDTLLVQNANPDCKT

ILKALGPGATLEEMMTACQGVGGPGHKARVLAEAMSQATNT

CGREGHQMKDCTERQANFLGKIWPSNKGRPGNFLQSRPEPTA^



2003_CON_F gag.OPT

ATGGGCGCCCGCGCCTCCGTGCTGTCCGGCGGCAAGCTGGACGCCTGGGAGAAGATCCGCCTGCGCCCCGGCGGCAAGAA 

AGTACCGCATGAAGCACCTGGTGTGGGCCTCCCGCGAGCTGGAGCGCTTCGCCCTGGACCCCGGCCTGCTGGAGACCTCCGA 

GGGCTGCCAGAAGATCATCGGCCAGC^GCAGCCCTCCCTGCAGACCGGCTCCGAGGAGCTGCGCTCCCTGTACAACACCGTG 

GCCGTGCTGTACTGCGTGCACCAGAAGGTGGAGGTGAAGGACACCAAGGAGGCCCTGGAGAAGCTGGA 

AGTCCGAGCAGAAGACCCAGCAGGCCGCCGCCGACAAGGGCGTGTCCC^ 

GATGGTGCACCAGGCCATCTCCCCCCGCACCCTGAACGCCTGGGTGAAGGTGATCGAGGAGAAGGCCTTCTCCCCCGAGGTG 

ATCCCCATGTTCTCCGCCCTGTCCGAGGGCGCCACCCCCCAGGACCTGAAGACCATC 

CCGCCATGCAGATGCTGAAGGACACCATGAACGAGGAGGCCGCCGAGTGGGACCGCCTGCACCCCG 

CCCCCCCGGCCAGATGCGCGAGCCCCGCGGCTCCGACATCGCCC^ 

ACCTCCAACCCCCCCGTGCCCGTGGGCGACATCTACAAGCGCTGGATCATCCTG 

CCCCCGTGTCCATCCTGGACATCCGCCAGGGCCCCAAGGAGCCCTTCCGCGACT^ 

CGCCGAGCAGGCCACCCAGGAGGTGAAGGGCTGGATGACCGACACCCTGCTGGTGC 

ATCCTGAAGGCCCTGGGCCCCGGCGCCACCCTGGAGGAGATGATGACCGCCTGCCAGGGCGTGGGCGGCCCCGGCCACAAGG 

CCCGCGTGCTGGCCGAGGCCATGTCCCAGGCCACCAACACCGCCATCATGATGCAGAAGTCCAACTTCAAGGGCCAGCGCCG 

CATCGTGAAGTGCTTCAACTGCGGCAAGGAGGGCCACATCGCCAAGAACTGCCGCGCCCCCCGCAAGAAGGGCTGCTGGA 

TGCGGCCGCGAGGGCCACCAGATGAAGGACTGCACCGAGCGCCAGGCCAACTTCCTGGGCAAGATCTGGCCCTCCAACAAGG 

GCCGCCCCGGCAACTTCCTGCAGTCCCGCCCCGAGCCCACCGCCCCCCCCGCCGAGTCCTTCGGCTTCCGCGAGGAGATCAC

CCCC^CCCCCAAGCAGGAGCAGAAGGACGAGGGCCTGTACCCCCCCCTGGCCT^ 

TAA 



12. 2003_CON_G gag.PEP

MGARASVLSGGKLDAWEKIRLRPGGKKKYRMKHLVWA

ATLYCVHQRIEVKDTKEALEEVEKIQKKSQQKTQQAAMDEGNSSQVSQNYPIVQNAQGQMVHQAISPRTLNAWVKVVEEKAF

SPEVIPMFSALSEGATPQDIiNTMIiNTVGGHQAAMQMIiKDTINEEAAETO 

IRWMTSNPPIPVGEIYKRWIILGLNKIVRMYSPVSILDIRQGPKEPFRDYVDRFFKTLRAEQATQEVKGWMTDTLLVQNANP
DCKTILRALGPGATLEEMMTACQGVGGPSHKARVIAEAMSQASGAAAAIMMQK^ 
KKGCWKCGKEGHQMKDCTERQANFLGKIWPSNKGRPGNFLQNRPE 
FGSDP$ 

2003_CON_G gag.OPT

ATGGGCGCCCGCGCCTCCGTGCTGTCCGGCGGCAAGCTGGACGCCTGGGAGAAGATCCGCCTGCGCCCCGGCGGCAAGAAGA
AGTACCGCATGAAGCACCTGGTGTGGGCCTCCCGCGAGCTGGAGCGCTTCGCCCTGAACCCCGACCTGCTGGAGACCGCCGA 
GGGCTGCCAGCAGATCATGGGCCAGCTGCAGCCCGCCCTGCAGACCGGCACCGAGGAGCTGCGCTCCCTGTTCAACACCGTG 
GCCACCCTGTACTGCGTGCACGAGCGCATCGAGGTGAAGGACACCAAGGAGGCCCTGGAGGAGGTGGAGAAGA 
AGTCCCAGCAGAAGACCCAGCAGGCCGCCATGGACGAGGGCAACTCCTCCCAGGTGTCCCAGAACTACCCCATCGTGCAGAA 
CGCCCAGGGCCAGATGGTGCACCAGGCCATCTCCCCCCGCACCCTGAACGCCTGGGTGAAGGTGGTGGAGGAGAAGGCCTTC 
TCCCCCGAGGTGATCCCCATGTTCTCCGCCCTGTCCGAGGGCGCCACCCCCCAGGACCTGAACACCATGCTGAACACCGTGG 
GCGGCCACCAGGCCGCCATGCAGATGC^GAAGGACACCATCAACGAGGAGGCC 
GGCCGGCCCC^TCCCCCCCGGCCAGATCCGCGAGCCCCGCGGCTCCG^^ 

ATCCGCTGGATGACCTCCAACCCCCCCATCCCCGTGGGCGAGATCTACAAGCGCTGGATCATCCTGGGCCTGAACAAGATCG
TGCGCATGTACTCCCCCGTGTCCATCCTGGACATCCGCCAGGGCCCCAAGGAGCCCTTCCGCGACTACGTGGACCGCTTCTT 







CAAGACCCTGCGCGCCGAGCAGGCCACCCAGGAGGTGAAGGGCTGGATGACCGACACCCTGCTGGTGCAGAACGCCAACCCC
GACTGCAAGACCATCCTGCGCGCCCTGGGCCCCGGCGCCACCCTGGAGGAGATGATGACCGCCTGCCAGGGCGTGGGCGGCC 
CCTCCCACAAGGCCCGCGTGCTGGCCGAGGCCATGTCCCAGGCCTCCGGCGCCGCCGCCGCCATCATGATGCAGAAGTCCAA 
CTTCAAGGGCCCCCGCCGCACCATCAAGTGCTTCAACTGCGGCAAGGAGGGCCAC

AAGAAGGGCTGCTGGAAGTGCGGCAAGGAGGGCCACCAGATGAAGGACTGCACCGAGCGCCAGGCCAACTTCCTGGGCAAGA 
TCTGGCCCTCCAACAAGGGCCGCCCCGGC^CTTCCTGCAGAACCGCCCCGAGCCCACCGCCCCCCCCG 
CTTCGGCGAGGAGATCGCCCCCTCCCCCAAGCAGGAGCAGAAGGAGAAGGAGCTGTACCCCCTGGCCTCCCTGAAGTCCCTG
TTCGGCTCCGACCCCTAA 



13. 2003_CON_H gag.PEP






MGARASVLSGGKI^AWEKIRLRPGGKKKYRJjKHLVWASREL^ 
AVLYCVHQRIDVKDTKEAIjGKIEEIQNKSQQKTQQA 
SPETVI PMFSALSEGATPQDLNAMiNTVGGHQAAMQMLKDT 

IAWMTGNPPIPVGDIYKRWIILGLNKIVRMYSPVSILDIKQGPKEPFRDYVDRFFKTLRAEQATQDVKNWMTDTLLVQNANP
DCKTIIJIAIXSQGASIEEMMTACQGVGGPSHKARVLAEAM^ 
KKGCWKCGREGHQMKDCTERQANFLGKIWPSSKGRPGNFLQSRPEPTAPPAESFGFGEEMTPSPKQELKDKEPPLASLRSLF
GNDPLSQ$ 

2003_CON_H gag.OPT

ATGGGCGCCCGCGCCTCCGTGCTGTCCGGCGGCAAGCTGGACG 

AGTACCGCCTGAAGCACCTGGTGTGGGCCTCCCGCGAGCTGGAGCGCTTCGCCCTGAACCCCGGCCTCCTGGAGACCGCCGA 

GGGCTGCCTGCAGATC^TCGAGCAGCTGGAGCCCGCCATCAAGACCGGCACCGAGGAG 

GCCGTGCTGTACTGCGTGCACCAGCGCATCGACGTGAAGGACACCAAGGAGGCCCTGG

AGTCCCAGCAGAAGACCCAGCAGGCCGCCGCCGACAAGGAGAAGGACAACAAGGTGTCCCAGAACTACCCCATCGTGCAGAA

CGCCCAGGGCCAGATGGTGCACCAGGCCATCTCCCCCCGCACCC^ 

TCCCCCGAGGTGATCCCCATGTTCTCCGCCCTGTCCGAGG^^^ 

GCGGCCACCAGGCCGCCATGCAGATGCTGAAGGACACCATCAACGAGGAGGCCGCCGAGTGGGACCGCCTGCACCCCGTGCA
CGCCGGCCCC^TCCCCCCCGGCCAGATCCGCGAGCCCCGCGGCT^ 

ATCGCCTGGATGACCGGCAACCCCCCCATCCCCGTGGGCGACATCTACAAGCGCTGGATGATCCTGGGCCTGAACAAGATCG 

TGCGCATGTACTCCCCCGTGTCCATCCTGGACATCAAGCAGGGCCCCAAGGAGCCCTTCCGCGACTACGTGGACCGCTTCTT 

CAAGACCCTGCGCGCCGAGCAGGCCACCCAGGACGTGAAGAACTGGATGACCGACACCCTGCTGGTGCAGAACGCCAACCCC 

GACTGCAAGACCATCCTGCGCGCCCTGGGCCAGGGCGCCTCCATCGAGGAGATGATGACCGCCTGCCAGGGCGTGGGCGGCC 

CCTCCCACAAGGCCCGCGTGCTGGCCGAGGCCATGTCCCAGGTGACCAACGCCAACGCCGCGATGATGATGCAG 

CTTCAAGGGCCCCCGCAAGATCGTGAAGTGCTTCAACTGCGGCAAGGAGGGCCACATCGCCCGCAACTGCCGCGCCCCCCGC 

AAGAAGGGCTGCTGGAAGTGCGGCCGCGAGGGCCACCAGATGAAGGACTGCACCGAGCGCCAGGCCAACTTCC 

TCTGGCCCTCCTCCAAGGGCCGCCCCGGC^^CTTCCTGC^^ 

CTTCGGCGAGGAGATGACCCCCTCCCCCAAGCAGGAGCTGAAGGACAAGGAGCCCCCCCTGGCCTCCCTGCGCTCCCTGTTC
GGCAACGACCCCCTGTCCCAGTAA 

14. 2003_CON_K gag.PEP

MGARASVLSGGKLDTWEKIRLRPGGKKKYRLKHLVWAS

ATLYCVHQRIEVRDTKEALDKLEEEQNKSQQKTQQETADKGVSQNYPIVQNLQGQMVHQALSPRTLNAWVKVIEEKAFSPEV
IPMFSAIjSEGATPQDIjNTMI*NTVGGHQAAMQMIjKI>TINEEJ^ 

TSNPPVPVGEIYKRWIILGLNKIVR^SPVSIIiDIRC^PKEPFRDYVDRFFKTLRAEQATQEVra 

ILKALGPGASLEEMMTACQGVGGPGHKARILAEAMSQVTNTAVMMQRGNFKGQRKIIKCFNCGKEGHIARNCRAPRKKGCWK
CGKEGHQMKDCTERQANFLGKIWPSNKGRPGNFLQSRPEPTAPPAESFGFGEEITPSPRQETKDKEQGPPLTSLKSLFGNDP 
LSQ$ 



2003_CON_K gag.OPT

ATGGGCGCCCGCGCCTCCGTGCTGTCCGGCGGCAAGCTGGACACCTGGGAGAAGATCCGCCTGCGCCCCGGCGGCAAGAAGA
AGTACCGCCTGAAGCACCTGGTGTGGGCCTCCCGCGAGCTGGAGCGCTTCGCCCTGAACCCCTCCCTGCTGGAGACCACCGA
GGGCTGCCGCCAGATGATCCGCGAGCTGCAGCCCTCCCTGCAGACCGGC^ 

GCCACCCTGTACTGCGTGCACCAGCGCATCGAGGTGCGCGACACCAAGGAGGCCCTGGACAAGCTGGAGGAGGAGCAGAACA 

AGTCCCAGCAGAAGACCCAGCAGGAGACCGCCGACAAGGGCGTGTCCCAGAACTACCCGATCGTGCAGAACCTGCAGGGCCA 

GATGGTGCACCAGGCCCTGTCCCCCCGCACCCTGAACGCCTGGGTGAAGGTGATCGAGGAGAAGGCCTTCTCCCCCGAGGTG 

ATCCCGATGTTCTCCGCCCTGTCCGAGGGCGCGACCCCCCAGGACCTGAACA 

CCGCCATGCAGATGCTGAAGGACACCATCAACGAGGAGGCCGCCGAGTGGG 

CCCCCCCGGCCAGATGCGCGAGCCCCGCGGCTCCGACATCGCCGGCACCACCT 

ACCTCCAACCCCCCCGTGCCCGTGGGCGAGATCTACAAGCGCTGGATCATCCTGGGCCTGAACAAGATCGTGCGCATGTACT 






CCCCCGTGTCCATCCTGGACATCCGCCAGGGCCCCAAGGAGCCCTTCCGCGACTACGTGGACCGCTTCTTCAAGACCCTGCG
CGCCGAGCAGGCCACCCAGGAGGTGAAGAACTGGATGACCGACACCCTGCTGGTGCAGAACGCCAACCCCGACTGCAAGACC 
ATCCTGAAGGCCCTGGGCCCCGGCGCCTCCCTGGAGGAGATGATGACCGCCTGCCAGGGCGTGGGCGGCCCCGGCCACAAGG 
CCCGCATCCTGGCCGAGGCCATGTCCCAGGTGACCAACACCGCCGT^ 

GATCATCAAGTGCTTCAACTGCGGCAAGGAGGGCCACATCGCCCGCAACTGCCGCGCCCCCCGCAAGAA 
TGCGGCAAGGAGGGCCACCAGATGAAGGACTGCACCGAGCGCCAGGCCAACTTCCTGGGCAAGATCTGGCCCTCCAAC^ 
. GCCGCCCCGG(^U*CTTCCTGCAGTCCCGCCCCGAGCCCACCGCCC^ 
CCCCTCCCCCCGCCAGGAGACGAAGGACAAGGAGCAGGGCCCCCCCCTGACCTCCC 
CTGTCCCAGTAA 

15. 2003_CON_01_AE gag.PEP

MGARASVLSGGKIiDAWEKIRIjRPGGKKKyRMKHIiWASRBLERFA 

ATLWCVHQRIEVKDTKEALDKIEEVQNKSQQKTQQAAAGTGSSSKVSQNYPIVQNAQGQMVHQPLSPRTLNAWVKVVEEKGF
NPEVIPMFSALSEGATPQDLNMMLNIVGGHQAAMQMIjKETINEEAAEWDRVHPVHA 

IGWMTNNPPIPVGDIYKRWIILGLNKIVRMYSPVSILDIRQGPKEPFRDYVDRFYKTLRAEQATQEVKNWMTETLLVQNANP
DCKSILKALGTGATLEEMMTACQGVGGPSHKARVLAEAMSQAQHANIMMQRGNFKGQKRIKCFNCGKEGHLARNCRAPRKKG
C^TCCGKEGHQMKDCTERQANFIjGKIWPSNKGRPGNFPQSRPEPTAPPAEN^^ 
NDPLSQ$ 



2003_CON_01_AE gag.OPT

ATGGGCGCCCGCGCCTCCGTGCTGTCCGGCGGCAAGCTGGACGCCTGGGAGAAGATCCGCCTGCGCCCCGGCGGCAAGAAGA
AGTACCGGATGAAG^CCTGGTGTGGGCCTCCCGCGAGCTGGAGCGCT^CG 

GGGCTGCCAGCAGATCATCGAGCAGCTGCAGTCCACCCTGAAGACCGGCTCCGAGGAGCTGAAGTCCCTGTTCAACACCGTG
GCCACCCTGTGGTGCGTGCACCAGCGCATCGAGGTGAAGGACACCAAGGAGGCCCTGGACAAGATCGAGGAGGTGCAGAACA
AGTCCCAGCAGAAGACCCAGCAGGCCGCCGCCGGCACCGGCTCCTCCTCCAAGGTGTCCCAGAACTACCCCATCGTGCAGAA
CGCCCAGGGCCAGATGGTGCACCAGCCCCTGTCCCCCCGCACCCTGAACGCCTGGGTGAAGGTGGTGGAGGAGAAGGGCTTC 
AACCCCGAGGTGATCCCCATGTTCTCCGCCCriX3TCCX3AGG^ 

GCGGCCACCAGGCCGCCATGCAGATGCTGAAGGAGACCATCAACGAGGAGGCCGCCGAGTGGGACCGCGTGCACCCCGTGCA 
CXSCCGGCCCCATCCCCCCCGGCC^GATGCGCGAGCCCC^ 

ATCGGCTGGATGACCAACAACCCCCCCATCCCCGTGGGCGACATCTACAAGCGCTGGATC^^ 

TGCGCATGTACTCCCCCGTGTCCATCCTGGACATCCGCCAGGGCCCCAAGGAGCCCTTCCGCGACTACGTGGACCG 
CAAGACCCTGCGCGCCGAGCAGGCCACCCAGGAGGTGAAGAACTGGATGACCGAGACCCTGCTGGTGCAGAACGC 
GACTGCAAGTCCATCCTGAAGGCCCTGGGCACCGGCGCCACCCTGGAGGAGATGATGACCGCCTGCCAGGG^ 
CCTCCCACAAGGCCCGCGTGCTGGCCGAGGCCATGTCCCAG 

GGGCCAGAAGCGCATCAAGTGCTTCAACTGCGGCAAGGAGGGCCACCTGGCCCGCAACTGCC^ 
TGCTGGAAGTGCGGCAAGGAGGGCCACCAGATGAAGGACTGCACCGAGCGCCAGGCCAACTTC 
CCAAOUKKKXrCGCCCCGGCAACTTCCCCCAGTCCCGCCCCGAGCC 

GGAGATCACCTCCCTGCCCAAGCAGGAGCAGAAGGACAAGGAGCACCCCCCCCCCCTGGTGTCCCTGAAGTCCCTGTTCGGC 
AACGACCCCCTGTCCCAGTAA 

16. 2003_CON_02_AG gag.PEP
MGARASVLSGGKLDAWEKIRLRPGGKKKYRLKHLVWASRELERFAL
ATLWCVHQRIDIKDTKEALDKIEEVQNKSKQKTQQAAAATGSSSQNYPIVQNAQGQMTHQSMSPRTLNAWVKVIEEKAFSPE
VIPMFSALSEGATPQDLNMMLNIVGGHQAAMQMLKDTINEEAAEWDRVHPVHAGPIPPGQMREPRGSDIAGTTSTLQEQIGW
OTSNPPIPVGEIYKRWIVX/SLNKIVRMYSPVSIIiDIRQGPKEPFRDYVDRFFKTTL 
SILRAIK3PGATLEEMMTACQGV(X3PGHKARVIiAEAM^ 

CGKEGHQMKDCTERQANFLGKIWPSSKGRPGNFPQSRPEPTAPPAESFGMGEEITSSPKQEPRDKGLYPPLTSLKSLFGNDP
$ 






2003_CON_02_AG gag.OPT

ATGGGCGCCCGCGCCTCCGTGCTGTCCGGCGGCAAGCTGGACGCCTGGGAGAAGATCCGCCTGCGCCCCGGCGGCAAGAAGA 

AGTACCGCCTGAAGCACCTGGTGTGGGCCTCCCGCGAGCTGGAGCGCTTCGCCCTKxAA 

GGGCTGCCAGCAGATCATGGAGCAGCTGCAGTCCGCCCTGCGCACC 

GCCACCC^GTGGTGCGTGCACCAGCGCATCGACATCAAGGACACCAAGGAGGCCCTGGACAAGATCGAGGAGG 

AGTCCAAGCAGAAGACCCAGCAGGCCGCCGCCGCCACCGGCTCCTCCTCCCAGAACTACCCCATCGTGCAGAACGCCCAGGG 

CCAGATGACCCACCAGTCCATGTCCCCCCGCACCCTGAACGCCTGGGTGAAGGTGATCGAGGAGAAGGCCTTCTCCCCCGAG 

GTGATCCCCATGTTCTCCGCCCTGTCCGAGGGCGCCACCCCCCAGGACCTC 

AGGCCGCCATGCAGATGCTGAAGGACACOATCAACGAGGAGGCCGCCGAGTGGGACCGCGTG 

CATCCCCCCCGGCCAGATGCGCGAGCCCCGCGGCTCCGAC^TCGCC 
ATGACCTCCAACCCCCCCATCCCCGTGGGCGAGATCTACAAGCGCTGGATCGTGCTGGGCCTGAACAAGATCGTGCGCATGT
ACTCCCCCGTGTCCATCCTGGACATCCXSCCAGGGCCCCAAGGAGCCCT^ 

GCGCGCCGAGCAGGCCACCCAGGAGGTGAAGAACTGGATGACCGAGACCCTGCTGGTGCAGAACGCCAACCCCGACTGCAAG 

TCCATCCTGCGCGCCCTGGGCCCCGGCGCCACCCTGGAGGAGATGATGACCGCCTGCCAGGGCGTGGGCGGCCCCGGCCACA 

AGGCCCGCGTGCTGGCCGAGGCCATGTCCCAGGTGCAGCAGTCCAACATC^TGATGCAGCGCGGCAACTTCCGCGGCCAGCG 

GACCATCAAGTGCTTCAACTGCGGCAAGGAGGGCCACCTGGCCCGCAACTGCAAGGCCCCCC^ 

TGCGGCAAGGAGGGCCACCAGATGAAGGACTGCACCGAGCGCCAGGCCAACTTCCTGGGCAAGATCT 

GCCGCCCCGGCAACTTCCCCCAGTCCCGCCCCGAGCCCACCGCCCCCCCCGCCGAGTCCTTCGGCATGGGCGAGGAGATCAC 
CTCCTCCCCCAAGCAGGAGCCCCGCGACAAGGGCCTGTACCCCCCCCTGACCTCCCTGAAGTCCCTGTTCGGCAACGACCCC

TAA 

17. 2003_CON_03_ABG gag.PEP

MGARASVLSGGKLDAWEKIRLRPGGKKKYRIKHLVWASR 

ATLYCVHQRIEIKDTKEALDKIEEIQNKSKQKTQQAATGTGSSSKVSQNYPIVQNAQGQMTHQSMSPRTLNAWVKVIEEKAF
SPEVIPMFSALSEGATPQDLNMMI^IVGGHQAAMQML^ 

IGWMTSNPPIPVGDIYKRWIII^IiNKIVRMYSPVSIIiDIRQGPKEPFRDYVDRFFKTIJ^ 
pCKTILRALGSGATI^EMOTACQGVGGPGH^ 

CWKCGKEGHQMKDCTERQANFLGRIWPSSKGRPGNFPQSRPEPSAPPAENFGMGEEITPSLKQEQKDREQHPPSISLKSLFG
NDPLSQ$ 

2003_CON_03_ABG gag.OPT

ATGGGCGCCCGCGCCTCCGTGCTGTCCGGCGGCAAGCTGGACGCC^ 

AGTACCGC^TCAAGCACCTGGTGTGGGCCTCCCGCGAGCTGGAGCGCTTCGCCCTGAACCCCTCCCTC 

GGGCTGCGAGCAGATCCTGGAGCAGCTGCAGCCCACCCTGAAGACCGGCTCCGAGGAGCTGAAGTCCCTGTACAACACCGTG 

GCCACCCTGTACTGCGTGCACCAGCGCATCGAGATCAAGGACACCAAGGAGGCCCTGGACAAGATCGAGGAGATCCAGAACA 

AGTCCAAGCAGAAGACCCAGCAGGCCGCCACCGGCACCGGCTCCTCCTCCAAGGTGTCCCAGAACTACCCCATCGTGCAGAA 

CGCCCAGGGCCAGATGACCCACCAGTCCATGTCCCCCCGCACCCTGAACGCCTGGGTGAAGGTGATCGAGGAGAAGGCCTTC

TCCCCCGAGGTGATCCCCATGTTCTCCGCCCTGTCCGAGGGCGCCACCCCCCAGGACCTGAACATGATGCTGAACATCGTGG 

GCGGCCACCAGGCCGCCATGCAGATGCTGAAGGACACCATCAACGAGGAGGCCGCCGAGTGGGACCGCCTGCACCCCGCCCA 

GGCCGGCCCCTTCCCCCCCGGCCAGATGCGCGAGCCCCGCGGCTCCGACATCGCCGGCACCACCTCCACCCTGCAGGAGCAG

ATCGGCTGGATGACCTCCAACCCCCCCATCCCCGTGGGCGACATCTACAAGCGCTGGATCATCCTGGGCCTGAACAAGATCG 

TGCGCATGTACTCCCCCGTGTCGATCCTGGACATCCGCCAGGGCCCCAAGGAGCCCTTCCGCGACTACGTGGACCGCTTCTT

CAAGACCCTGCGCGCCGAGCAGGCCACCCAGGACGTGAAGAACTGGATGACCGAGACCCTGCTGGTGCACaAACGCCAAC^ 

GACTGCAAGACCATCCTGCGCGCCCTGGGCTCCGGCGCCACCCTGGAGGAGATGATGACCGCCTGCCAGGGCGTGGGCGGCC 

CCGGCCACAAGGCCCGCGTGCTGGCCGAGGCGATGTCCCAGGTGCAGAACGCCAACATCATGATGCAGAAGTCCAA^ 

CGGCCCCAAGCGCATCAAGTGCTTCAACTGCGGCAAGGAGGGCCACCTGGCCCGCAACTGCCGCGCCCCCCGCAAGAAGGGC

TGCTGGAAGTGCGGCAAGGAGGGCCACCAGATGAAGGACTGCACCGAGCGCCAGGCCAACTTCCTGGGCCGCATCT^ 

CCTCCAAGGGCCGCCCCGGCAACTTCCCCCAGTCCCGCCCCGAGCCCTCCGCCCCCCCCGCCGAGAACTTCGGCATGGGCGA

GGAGATCACCCCCTCCCTGAAGCAGGAGCAGAAGGACCGCGAGCAGCACCCCCCCTCCATCTCCCTGAAGTCCCTGTTCGGC

AACGACCCCCTGTCCCAGTAA 


18. 2003_CON_04_CPX gag.PEP

MGARASVLSGGKLDAWERIRIiRPGGKIOCyRIjKHLVWASREDERFAIiNPGIiL^ 

ATLWCVHQRIDVKDTKEALDKVEEMQNKSKQKTQQAAADTGGSSNVSQNYPIVQNAQGQMVHQSISPRTLNAWVKVIEEKAF
SPEVIPMFSALSEGATPQDLNMMLNIVGGHQAAMQMLKDTINEEAAEWDRAHPVHAGPIPPGQMREPRGSDIAGTTSTLQEQ
IGWMTSNPPIPVGEIYKRWIILGLNKIVRMYSPVSILDIRQGPKEPFRDYVDRFFKCLRAEQATQEVKNWMTETLLVQNANP
DCKSILKALGTGATLEEMMTACQGVGGPSHKARVLAEAMSQASNAAAAIMMQKSNFKGQRRIIKCFNCGKEGHLARKCRAPR
KKGCWKCGKEGHQMKDCTERQANFLGRMWPSSKGRPGNFLQSRPEPTAPPAESLEMKEETTSSPKQEPRDKELYPLTSLKSL

FGSDPLSQ$ 

2003_CON_04_CPX gag.OPT

ATGGGCGCCCGCGCCTCCGTGCTGTCCGGCGGCAAGCTGGACGCCTGGGAGCGCATCCGCCTGCGCCCCGGCGGCAAGAAGA 
AGTACCGCCTGAAGCACCTGGTGTGGGCCTCCCGCGAGCTGGAGCGCTTCGCCCTGAACCCCGGCCTGCTGGAGACCGCCGA 
GGGCTGCCAGCAGCTGATGGAGCAGCTGCAGTCCACCCTGAAGACCGGCTCCGAGGAGCTGAAGTCCCTGTTCAACACCATC 
GCCACCCTGTGGTGCGTGCACCAGCGCATCGACGTGAAGGACACCAAGGAGGCCCTGGACAAGGT^ 

AGTCCAAGCAGAAGACCCAGCAGGCCGCCGCCGACACCGGCGGCTCCTCCAACGTGTCCCAGAACTACCCCATCGTGCAGAA 
CGCCCAGGGCCAGATGGTGCACCAGTCCATCTCCCCCCGCACCCTGAACGCCTGGGTGAAGGTGATCGAGGAGAAGGCCTTC
TCCCCCGAGGTGATCCCCATGTTCTCCGCCCTGTCCGAGGGCGCCACCCCCCAGGACCTGAACATGATGCTGAACATCGTGG
GCGGCCACCAGGCCGCCATGCAGATGCTGAAGGACACCATCAACGAGGAGGCCGCCGAGTGGGACCGCGCCCACCCCGTGCA



CGCCGGCCCCATCCCCCCCGGCCAGATGCGCGAGCCCCGCGGCTCCGAC^^ 
ATCGGCTGGATGACCTTCCAACCCCCCCATCCCCGTGG 
TGOK^TGTACTCCCCCGTGTCCATCCTGGACATCCGCCAGGGCC 
CAAGTGCCTGCGCGCCGAGCAGGCCACCCAGGAGGTGAAGAACTGGATGACCGA^ 

GACTGCAAGTCCATCCTGAAGGCCCTGGGCACCGGCGCCACCCTGGAGGAGATGATGACCGCCTGCCAGGGCGTGGGCGGCC 
CCTCCCACAAGGCCCGCGTGCTGGCCGAGGCCATGTCCCAGGCCTCCAACGCCGCCGCCGCCAT 
„ CTTCAAGGGCCAGCGCCGCATCATCAAGTGCTTCAACTGCGGCT^ 
AAGAAGGGCTGCTGGAAGTGCGGCAAGGAGGGCCACCAGATGAAGGACTGCACCGAGCG 
TGTGGCCCTCCTCCAAGGGCCGCCCCGGGAACTTCCTGCA^ 
GATGAAGGAGK1AGACCACCTCCTCCCCCAAGCAGGAGCCCCGCGAGW 
" TTCGGCTCCGACCCCCTGTCCCAGTAA 



19. 2003_CON_06_CPX gag.PEP
MGARASVLSGGKLDEWEKIRLRPGGKKKYRLKHLVWASRELERFALNPG

ATLYCVHQRIKVTDTKEALDKIEEIQNKSKQKAQQAAAATGNSSNLSQNYPIVQNAQGQMVHQAISPRTLNAWVKVIEEKAF
SPEVIPMFSALSEGATPQDLNMMLNIVGGHQAAMQMLKDTINEEAAEVTORVHPVHAGPIPPGQ 

IGWMTSNPPIPVGEIYKRWIILGLNKIVRMYSPVSILDIRQGPKEPFRDYVDRFFKTLRAEQATQEVKNWMTDTLLVQNANP
DCKTILKAIX3PGATLEEMMTACQGVGGPGHKARVLAEAMSQASGTEAAIMMQKSNFKGP 

KKGCWKCGKEGHQMKDCTERQANFLGKIWPSNKGRPGNFLQNRPEPTAPPAESFGFGEETAPSPKQEPKEKELYPLASLKSL
FGNDP$ 

2003_CON_06_CPX gag.OPT

ATGGGCGCCCGCGCCTCCGTGCTGTCCGGCGGC^VAGCTGGACGAGTGG 
AGTACCGCCTGAAGC^CCTGGTGTGGGCCTCCCGCGAGCTGGAGCGCTO 

GGGCTGCGAGGAGATCATCGAGGAGCTGCAGTCCGCCCTGAAGACCGGCTCCGAGGAGCTGAAGTCCCTGTTACAACACCGTG 
GCC^CCCTGTACTGCGTGCACCAGCGCATOVAGGTGACCGACACCAAGGAGGCCCTGGA 
AGTCCAAGCAGAAGGCCCAGGAGGCCGCCGCCGCCACCGGCAACTCCTCCAACCTGTC 
CGCCCAGGGCCAGATGGTGCACCAGGCCATCTCCCCCCGC 

TCCCCCGAGGTGATCCCCATGTTCTCCGCCCTGTCCGAGGGCGCCACCCCCCAGGACCTGAACATGATGCTGAACATCGTGG 
GCGGCCACCAGGCCGCCATGCAGATGCTGAAGGACACCATCAACGAGGAGGCCGCCGAGTGGGACCGCGTGCACCCCGTGCA 
CGCCGGCCCC^TCCCCCCCGGCCAGATGCGCGAGCCCOTCGGCT 

ATCGGCTGGATGACCTCCAACCCCCCCATCCCCGTGGGCGAGATCTACAAGCGCTGGATCATCCTGGGCCTGAACAAGATCG 

TGCGCATGTACTCCCCCGTGTCCATCCTGGACATCCGCCAGGGCCCCAAGGAGCCCTTCCGCGACTACGTGGACCGCTTCTT 

CAAGACCCTGCGCGCCGAGCAGGCCACCCAGGAGGTGAAGAACTGGATGACCGACACCCTGCTGGTGCAGAACGCCAACCCC 

GACTGCAAGACCATCCTGAAGGCCCTGGGCCCCGGCGCCACCCTGGAGGAGATGATGACCGCCTGCCAGGGCGTGGGCGGCC 

CCGGCCACAAGGCCCGCGTGCTGGCCGAGGCCATGTCCCAGGCCTCCGGCACCGAGGCCGCCATCATGATGCAGAAGTCCAA 

CTTCAAGGGCCCCAAGCGCTCCATCZAAGTGCTTCAACTGCGGCAAGGAGGGCCACCTGGCCCGCAACTGCCGCG^ 

AAGAAGGGCTGCTGGAAGTGCGGCAAGGAGGGCCACCAGATGAAGGACTGCACCGAGCGCCAGGCCAACTTCCTGGGCAAGA

TCTGGCCCTCC^UICAAGGGCCGCCCCGGCAACTTCCTGC^^ 

CTTCGGCGAGGAGACCGCCCCCTCCCCCAAGCAGGAGCCCAAGGAGAAGGAGCTGTACCCCCTGGCCTCCCTGAAGTCCCTG 
TTCGGCAACGACCCCTAA 



20. 2003_CON_07_BC gag.PEP
MGARASILRGGKLDKWEKIRLRPGGKKHYMLKHLVWASRELERFALNPGLLETSEGCKQIIKQLQPALQTGTEELRSLFNTV
ATLYCVHTEIDVRDTKEALDKIEEEQNKIQQKTQQAKEADGKVSQNYPIVQNLQGQMVHQPISPRTLNAWVKVVEEKAFSPE
VIPMFSALSEGATPQDIiNTMIjNTVGGHQAAMQIIjKDTINEEAAEWDRIjHPVHAGP 

MTSNPPVPVGDI YKRWI ILGLNKI VRMYSPTS ILDIKQGPKEPFRDYVDRFFKTIiRAEQATQDVKNWM^ 
TILRALGPGASIEEMMTACQGVGGPSHKARVIiAEAMSQTNSTIL 

CGKEGHQMKDCTERQANFIjGKIWPSHKGRPGNFIjQSRPEPTAPPEESFRFGEETTTPSQKQEP,IDKE 
SSQ$ 

2003_CON_07_BC gag.OPT

ATGGGCGCCCGCGCCTCCATCCTGCGCGGCGGCAAGCTGGAGAAGTGGGAGAAGATCCGCCT^ 
ACTACATGCTGAAGCACCTGGTGTGGGCCTCCCGCGAGCTGGAGCG^^ 

GGGCTGCi\AGCAGATCATCAAGCAGCTGCAGCCCGCCCTGCAGACCGGCACCGAGGAGCTGCGCTCCCTGTTCAACACC 

GCCACCCTGTACTGCGTGCACACCGAGATCGACGTGCGCGACACCAAGGAGGCCCTGGACAAGATCGAGGAGGAGCAGAACA 

AGATCCAGCAGAAGACCCAGCAGGCCAAGGAGGCCGACGGCAAGGTGTCCCAGAACTACCCCATCGTGCAGAACCTGCAGGG 

CCAGATGGTGCACCAGCCCATCTCCCCCCGCACCCT 

GTGATCCCCATGTTCTCCGCCCTGTCCGAGGGCGCCACC^ 
AGGCCGCCATGCAGATCCTGAAGGACACCATCAACGAGGAGGCCGCCGAGTGGGACCGCCTGCACCCCGTGCACGCCGGCCC
CATCGCCCCCGGCCAGATGCGCGAGCCCCGCGGCTCCGAC^^ 

ATGACCTCCAACCCCCCCGTGCCCGTGGGCGACATCTACAAGCGCTGGATCATCCTGGGCCTt^ 

ACTCCCCCACCTCCATCCTGGACATCAAGCAGGGCCCCAAGGAGCCCTTCCGCGACTACGTGGACCGCTTCTTCAAGACCCT 

GCGCGCCGAGCAGGCCACCCAGGACGTGAAGAACTGGATGACCGACACCCTGCTGGTGCAGAACGCGAACC 

ACCATCCTGCGCGCCCTGGGCCCCGGCGCCTCCATCGAGGAGATGATGACCGCCTGCCAGGGCGTGGGCGGCCCCTCCCACA

. AGGCCCGCGTGCTGGCCGAGGCCATGTCCCAGACCAACTCC^^ 
CATCGTX3AAGTGCTTCAACTGCGGCAAGGAGGGCCACATCGCCCGCAACTGCCGCGCCC 
TGCGGCAAGGAGGGCCACCAGATGAAGGACTGCACCGAGCGCCAGGCCAACTTCCTGGGCA 
GCCGCCCCGGCAACTTCCTGCAGTCCCGCCCCGAGCCC 

" GACCCCCTCCCAGAAGCAGGAGCCCATCGACAAGGAGCTGTACCCCCTGACCTCCCTGAAGTCCCTGTTCGGCAACGACCCC 

TCCTCCCAGTAA 

21. 2003_CON_08_BC gag.PEP

MGARA£ILRGGKLDKWEKIRIjRPGGKKHYMIjK3IL^ 

ATLYCVHAEIEVRDTKEALDKIEEEQNKIQQKTQQAKEADEKVSQNYPIVQNLQGQMVHQPLSPRTLNAWVKVVEEKAFSPE
VI PMFTAIj S EGATPQDIjNTMIjNTVGGHQAAMQMLKDTINEEAAEWDRIjHPVHA 

MTNNPPIPVGEIYKRWIILGLNKIVRMYSPTSILDIKQGPKEPFRDYVDRFFKTLRAEQATQDVKNWMTDTLLVQNANPDCK

TILRALGPGASLEEMhTTACQGVGGPSHKARVIiAEAM^ 

CGKEGHQMKDCTERQANFIXSKIWPSHKGRPGNFLQSRPEPTAPPAESra 

2003_CON_08_BC gag.OPT

ATGGGCGCCCGCGCCTCCATCCTGCGCGGCGGCAAGCTGGACAAGTGGGAGAAGATCCGCCTGCGCCCCGGCGGCAAGAAGC 
ACTACATGCTGAAGCACCTGGTGTGGGCCTCCCGCGAGCTGGAGCGCTTCGCCCTGAACCCCGGCCTGCTGGAGACCTCCGA 
GGGCTGCAAGCAGATCATCAAGCAGCTGCAGCCCGCCCTGCAGAC 

GCCACCCTGTACTGCGTGCACGCCGAGATCGAGGTGCGCGACACCAAGGAGGCCCTGGACAAGATCGAGGAGGAGCAGAACA 
AGATCCAGCAGAAGACCCAGCAGGCCAAGGAGGCCGACGAGAAGGTGTCCCAGAACTACCCCATCGTGCAGAACCTGCAGGG 
CCAGATGGTGCACCAGCCCCTGTCCCCCCGCACCCrrGAACGCCTGGGTGAAGGTGGTGGAGGAGAAGGCCl^ 
GTGATCCCCATGTTCACCGCCCTGTCCGAGGGCGCCACCCCCCAGGACCTGAACACCATGCTGAACACCGTGGGCGGCCACC 
AGGCCGCCATGCAGATGCTGAAGGACACCATCAACGAGGAGGCCGCCGAGTGGGACCGCCTGC^ 
CGTGGCCCCCGGCC^GATGCGCGAGCCCCGCGGCTCCGACATC 

ATGACCAACAACCCCCCCATCCCCGTGGGCGAGATCTACAAGCGCTGGATCATCCTGGGCCTGAACAAGATCGTGCGCATGT 
ACTCCCCCACCTCCATCCTGGACATCAAGCAGGGCCCCAAGGAGCCCTTCCGCGACTACGTGGACCGCTTCTTCAAGACCCT 
GCX5CGCCGAGCAGGCCACCCAGGACGTGAAGAACTGGATGACCGAC7VCCCTGCTGGTC 
ACCATCCTOCGCGCCCTGGGCCCCGGCGCCTCCCIKK3AGGAGATGATGACCGCOT 
AGGCCCGCGTGCTGGCCGAGGCCATGTCCCAGACCAACAAC^^ 

CATCGTGAAGTGCTTCAACTGCGGCAAGGAGGGCCACATCGCCAAGAACTGCCGCGCCCCCCGCAAGAAGGGC 
TGCGGCAAGGAGGGCCACCAGATGAAGGACTGCACCGAGCGCCAGGCCAACTTCCTGGGCAAGATCTGGCCCTC 
GCCGCCCCGGCAACTTCCTGCAGTCCCGCCCCGAGCCCACCGCCCCCCCCGCCGAGTCCTTCCGCTTCGAGGAGACCACCCC
CGCCCCC^^GCAGGAGCCCAAGGACCGCGAGCCCCTGACCTCCCTG 



22. 2003_CON_10_CD gag.PEP

MGARASVLSGGKLDEWEKIRLRPGGKKKYRLKHL

ATLYCVHERIKVTDTKEALDKIEEEQTKSKKKAQQATADTGNSSQVSQNYPIVQNLQGQMVHQPLSPRTLNAWVKVIEEKAF
SPEVIPMFSALSEGATPQDLNTMLNTVGGHQAAMQMLKE

IRWMTSNPPIPVGEIYKRWIILGLNKIVRMYSPVSILDIRQGPKEPFRDYVDRFYKTLRAEQASQDVKNWMTETLLVQNANP
DCKTILKALGPAATLEEMMTACQGVGGPSHKARVLAEAMSQATSGNAIMMQRGNFKGPKKIIKCFNCGKEGHIAKNCRAPRK
KGCWKCGREGHQMKDCTERQANFLGKIWPSNKGRPGNFLQSRPEPTAPPAESFGFGEEITPSQKQEQKDKELHPLASLKSLF

GNDPLSQ$

2003_CON_10_CD gag.OPT

ATGGGCGCCCGCGCCTCCGTGCTGTCCGGCGGCAAGCTGGACGAGTGGGAGAAGATCCGCCTGCGCCCCGGCGGCAAGAAGA
AGTACCGCCTGAAGCACCTGGTGTGGGCCTCCCGCGAGCTGGAGCGCTTCGCCCTGAACCCCGGCCTGCTGGAGACCTCCGA
GGGCTGCAAGCAGATCATCGGCCAGCTGCAGCCCGCCATCCAGACCGGCTCCGAGGAGATCAAGTCCCTGTACAACACCGTG
GCCACCCTGTACTGCGTGCACGAGCGGATCAAGGTGACCGACACCAAGGAGGCCCTGGACAAGATCGAGGAGGAGCAGACCA
AGTCCAAGAAGAAGGCCCAGCAGGCCACCGCCGACACCGGCAACTCCTCCCAGGTGTCCCAGAACTACCCCATCGTGCAGAA
CCTGCAGGGCGAGATGGTGCACCAGCCCCTGTCCCCCCGCACCCTGAACGCCTGGGTGAAGGTGATCGAGGAGAAGGCCTTC
TCCCCCGAGGTGATCCCCATGTTCTCCGCCCTGTCCGAGGGCGCCACCCCCCAGGACCTGAACACCATGCTGAACACCGTGG
GCGGCCACCAGGCCGCCATGCAGATGCTGAAGGAGACCATCAACGAGGAGGCCGCCGAGTGGGACCGCCTGCACCCCGTGCA
GGCCGGCCCCGTGGCCCCCGGCCAGATCCGCGAGCCCCGCGGCTCCGACATCGCCGGCACCACCTCCACCCTGCAGGAGCAG
ATCCGCTGGATGACCTCCAACCCCCCCATCCCCGTCGGCGAGATCTAC^ 

TGCGCATGTACTCCCCCGTGTCCATCCTGGACATCCGCCAGGGCCCCAAGGAGCCCTTCCGCGACTACGTGGACCGCTTCTA 
CAAGACCCTGCGCGCCGAGCAGGCCTCCCAGGACGTGAAGAACTGGATGACCGAGACCCTGCTGGTGCAGAACGCCAACCCC 
GACTGCAAGACCATCCTGAAGGCCCTGGGCCCCGCCGCCACCCTGGAGGAGATGATGACCGCCTGCCAGGGCGTGGGCGGCC 
CCTCCCACAAGGCCCGCGTGCTGGCCGAGGCCATGTCCGAGGCCA 
. CAAGGGCCCCAAGAAGATGATGAAGTGCTTCAACTGCGGCAAGGAGGGCCACATCGCCAAG 

AAGGGCTGCTGGAAGTGCGGCCGCGAGGGCCACCAGATGAAGGACTGCACCGAGCGCCAGGCCAACTTCCTGGGCAAGATCT 
GGCCCTCCAACAAGGGCCGCCCCGGC^ACTTCCTGCAGTCCCGCC^ 

CGGCGAGGAGATCACCCCCTCCCAGAAGCAGGAGCAGAAGGACAAGGAGCTGCACCCCCTGGCCTCCCTGAAGTCCCT
" GGCAACGACCCCCTGTCCCAGTAA 



23. 2003_CON_11_CPX gag.PEP

MGARASVLSGGKLDAWEKIRLRPGGKKKYRLKHLVWASREL
RSLYNTVATLYCVHHRIEVKDTKEAIiDKIEEIQNKSKQKKQQAAADTGNSSKVS 

VVEEKAFSPEVIPMFSALSEGATPQDLNMMLNIVGGHQAAMQMLKDTINEEAAEWDRVHPVHAGPIPPGQMREPRGSDIAGT
TSTLQEQIGWMTGNPPVPVGEIYRRWIILGLNKIVRMYSPVSILDIRQGPKEPFRDYVDRFFKTLRAEQATQEVKSWMTETL
LIQNANPDCKSILRALGPGATLEEMMTACQGVGGPGHKARVLAEAMSQVQQTNIMMQRSNFKGQKRIKCFNCGKEGHIARNC
RAPRKKGCWKCGKEGHQMKDCTERQANFIXSKIWPSSKGRPGNFL^ 
LKSLFGSDPLSQ$

2003_CON_11_CPX gag.OPT

ATGGGCGCCCGCGCCTCCGTGCTGTCCGGCGGC^^GCTGGACGCCTGGGAGAAGATCCGCCTGCGCCCCGGCGG 

AGTACCGCeiXSAAGC^CCTGGTGTGGGCCTCCCGCGAGCTGGAGCGCTTCGCCCTGAAC 

GGGCTGCCAGCAGATCATGGGCCAGCrreCAGCCCX3CCCTGGGCAC 

GCCACCCTGTACTGCGTGCACCACCGCATCGAGGTGAAGGACACCAAGGAGGCCCTGGACAAGATCGAGGAGA 

AGTCCAAGCAGAAGAAGCAGCAGGCCGCCGCCGACACCGGCAACTCCTCCAAGGTGTCCCAGAACTACCCCATCGTGCAGAA

CGCCGAGGGCCAGATGGTGCACCAGGCCATCTCCCCCCGCACCCTGAACGCCTGGGTGA 

TCCCCCGAGGTGATCCCCATGTTCTCCGCCCTGTCCGAGGGCGCCACCCCCCAGGACCTGAACATGATGCTGAACATCGTGG
GCGGCCACCAGGCCGCCATGCAGATGCTGAAGGAGACCATCAACGAGGAGGCCGCCGAGTGGGACCGCGTGCACCCCGTGCA 
CGCCGGCCCCATCCCCCCCGGCCAGATGCGCGAGCCCCGCGGCTCCGACATCGCCGGCACCACCTCCACCCTGCAGGAGCAG 
ATCGGCTGGATGACCGGCAACCCCCCCGTGCCCGTGGGCGAGATCTACCGCCGCT^ 

TGCGCATGTACTCCCCCGTGTCCATCCTGGACATCCGCCAGGGCCCCAAGGAGCCCTTCCGCGACTACGTGGACCGCTTCTT
CAAGACCCTGCGCGCCGAGCAGGCCACCCAGGAGGTGAAGTCCTGGATGACCGAGACCCTGCTGATCCAGAACGCCAACCCC 
GACTGCAAGTCGATCCTGCGCGCCCTGGGCCCCGGCGCCACCCTGGAGGAGATGATGACCGCCTGCCAGGGCGTGGGCGGCC 
CCGGCCACAAGGCCCGCGTGCTGGCCGAGGCCATGTCCCAGGTGCAGCAGACCAACATCATGATC 

GGGCCAGAAGCGCATCAAGTGCTTCAACTGCGGCAAGGAGGGCCACCTGGCCCGCAACTGCCGCGCCCCCCGCAAGAAGGGC 

TGCTGGAAGTGCGGCAAGGAGGGCCACCAGATGAAGGACTGCACCGAGCGCCAGGCCAACTTCCTG 

CCTCC^GGGCCGCCCCGGCAACTTCCTGCAGTCCCGCCCCGAGCCCACCGCCCCCCCCGCCGAGTCCTTCGGCT^ 

GGAGATCGCCCCCTCCCCCAAGCAGGAGCCCAAGGAGAAGGAGCTGTACCCCCTGACCTCCCTGAAGTCCCTGTTCGGCTCC 

GACCCCCTGTCCCAGTAA 



24. 2003_CON_12_BF gag.PEP

MGARASVLSGGELDRWEKIRLRPGGKKKYRLKHIVWASRELERFAVNPGLLETSEGCRKIIGQLQP

AVLYFVHQKVEVKDTKEALDKLEEEQNKSQQKTQQAAADKGVSQNYPIVQNLQGQMVHQALSPRTLNAWVKVVEEKAFSPEV

IPMFSALSEGATPQDLNTMLNTVGGHQAAMQMLKDTINEEAAEWDRLHPVHA

TSNPPVPVGEIYKRWIILGLNKIVRMYSPVSILDIRQGPKEPFRDYVDRFFKT

ILKALGPGATLEEMMTACQGVGGPGHKARVLAEAMSQVTNTTVMMQKSNFKGQRRIVKCFNCGKEGHIAKNCRAPRKKGCWK
CGREGHQMKDCTERQANFLGKIWPSNKGRPGNFLQNRPEPTAPPAESFGFGEEITPSPKQEQKDEGLYPPLASLKSLFGNDP

$

2003_CON_12_BF gag.OPT

ATGGGCGCCCGCGCCTCCGTGCTGTCCGGCGGCGAGCTGGACCGCTGGGAGAAGATCCGCCTGCGCCCCGGCGGCAAGAAGA
AGTACCGCCTGAAGCACATCGTGTGGGCCTCCCGCGAGCTGGAGCGCTTCGCCGTGAACCCCGGCCTGCTGGAGACCTCCGA
GGGCTGCCGCAAGATCATCGGCCAGCTGCAGCCCTCCCTGCAGACCGGCTCCGAGGAGCTGCGCTCCCTGTACAACACCATC
GCCGTGCTGTACTTCGTGCACCAGAAGGTGGAGGTGAAGGACACCAAGGAGGC

AGTCCCAGCAGAAGACCCAGCAGGCCGCCGCCGACAAGGGCGTGTCCCAGAACTACCCCATCGTGCAGAACCTGCAGGGCCA
GATGGTGCACCAGGCCCTGTCCCCCCGCACCCTGAACGCCTGGGTGAAGGTGGTGGAGGAGAAGGCCTTCTCCCCCGAGGTG
ATCCCGATGTTCTCCGCCCTGTCCGAGGGCGCCACCCCCCAGGACCTGAACACCATGCTGAACACCGTGGGCGGCCACCAGG
CCGCCATGCAGATGCTGAAGGACACCATCAACGAGGAGGCCGCCGAGTGGGACCGCCTGCACCCCGTGCACGCCGGCCCCAT

CCCCCCCGGCC^GATGCGCGAGCCCCGCGGCTCCGACATCGCC^ 

ACCTCC^UVCCCCCCCGTGCCCGTGGGCGAGATCTAC^ 

CCCCCGTGTCCATCCTGGACATCCGCCAGGGCCCCAAGGAGCCCTTCCGCGACTACGTGGACCGCTTCTTCAAGACCCTGCG 

CGCCGAGCAGGCCACCCAGGAGGTGAAGGGCTGGATGACCGACACCCTGCTGGTGCAGAACGCCAACCCCGACTGCAAGACC

ATCCTGAAGGCCCTGGGCCCCGGCGCCACCCTGGAGGAGATGATGACCGCCTGCCAGGGCGTGGGCGGCCCCGGCCACAAGG

CCCGCGTGCTGGCCGAGGCCATGTCCCAGGTGACCAACACCACCGTGATGATGCAGAAGTCC 

CATCGTGAAGTGCTTCAACTGCGGCAAGGAGGGCCACATCGCCAAGAACIK^CGCGCCCCCCGCAAGAA 

TGCGGCCGCGAGGGCCACCAGATGAAGGACTGCACCGAGCGCCAGGCCAACTTCCTGGGCAAGATCTGGCCCTCCAACAAGG 
GCCGCCCCGGCAACTTCCTGCAGAACCGCCCCGAGCCCACCGCCCCCCCCGCCGAGTCCTTCGGCTTCGGCGAGGAGATCAC 
CCCCTCCCCCAAGCAGGAGCAGAAGGACGAGGGCCTGTACCCCCCCCTGGCCTCCCTGAAGTCCCTGTTCGGCAACGACCCC 
TAA 

25. 2003_CON_14_BG gag.PEP
MGARASVLSGGKLDAWEKIRLRPGGKKKY

ATLYCVHQKIEVKDTKEALEEVEKAQKKSQKKQQAAMDEGNNSQASQNYPIVQNAQGQMVHQAISPRTLNAWVKVVEEKAFS
PEVIPMFSAIiSEGATPQDIiNTMLNTVGGHQAAMQMLKOT 

RWMTSNPPIPVGEIYKRWIILGLNKIVRMYSPVSILDIRQGPKEPFRDYVDRFFKTLRAEQATQEVKGWMTDTLLVQNANPD

CKTTILRALGPGATIjEEMMTACQGVGGPSHKA^VIiAEAM 

OTKCGKEGHQMraCTESKANFLGKIWPSNKGRPG 

DP$SQ$ 

2003_CON_14_BG gag.OPT

ATGGGCGCCCGCGCCTCCGTGCTGTCCGGCGGCAAGCTGGACGCCTGGGAGAAGATCCGCCTGCGCCCCGGCGGCAAGAAGA 
AGTACCGCATGAAGCACCTGGTGTGGGCCTCCCGCGAGCTGGAGCGCTTCGCCCTGAACCCCGACCTGCTGGAGACCGCCGA
GGGCTGCCAGCAGATCATGGGCCAGCTGGAGCCCGCCCTGCAGACCGGCAC^ 

GCCACCCTGTACTGCGTGCACCAGAAGATCGAGGTGAAGGACACCAAGGAGGCCCTGGAGGAGGTGGAGAAGGCCCAGAAGA 
AGTCCCAGAAGAAGCAGCAGGCCGCCATGGACGAGGGCAACAACTCCCAGGCCTCCGAGAACTACCCCATCGTGCAGAACGC

CCCGAGGTGATCCCCATGTTCTCCGCCCTGTCCGAGGGCGCCACCCCCCAGGACCTGAACACCATGCTGAACACCGTGGGCG

GCCACCAGGCCGCCATGCAGATGCTGAAGGACACCATCAACGAGGAGGCCGCCGAGTGGGACCGCATGCACCCCCAGCAGGC

CGGCCCCATCCCCCCCGGCCAGATCCGCGAGCCCCGCGGCTCCGACATCGCCGGCACCACCTCCACCCTGCAGGAGCAGATC

CGCTGGATGACCTCCAACCCCCCCATCCCCGTGGGCGAGATCTACAAGCGCTGGATCATCCTGGGCCTGAACAAGATCGTGC

GCATGTACTCCCCCGTGTCCATCCTGGACATCCGCCAGGGCCCCAAGGAGCCCTTCCGCGACTACGTGGACCGCTTCTTCAA 

GACCCTGCGCGCCGAGCAGGCCACCCAGGAGGTGAAGGGCTGGATGACCGACACCCTGCTGGTGCAGAACGCCAACCCCGAC 

TGCAAGACCATCCTGCGCGCCCTGGGCCCCGGCGCCACCCTGGAGGAGATGATGACCGCCTGCCAGGGCGTGGGCGGCCCCT 

CCCACAAGGCCCGCGTGCTGGCCGAGGCCATGTCCCAGGCCTCCGGCGCCACCATCATGATGCAG

CCCCCGCCGCAACATCAAGTGCTTCAACTGCGGCAAGGAGGGCCACCTGGCC 

TGCTGGAAGTGCGGCAAGGAGGGCCACCAGATGAAGGACTGCAC^ 

CCAACAAGGGCCGCCCCGGCAACTTCCTGCAGAACCGCCCCGAGC^ 

GGAGATCGCCCCCTCCCCCAAGCAGGAGCCCAAGGAGAAGGAGATCTACCCCCTGGCCTCCCTGAAGTCCCTGTTCGGCTCC 
GACCCCTAATCCCAGTAA 
31. 2003_CON_S nef.PEP

MGGKWSKSSIVGWPAVRERIRRTPPAAEGVGAVSQDLDKHGAITSSNTAATNADCAWLEAQEEEEVGFPVRPQVPLRPMTYK
GAFDLSHFLKEKGGLDGLIYSKKRQEIIJ5LWV^ 
LHPMCQHGMEDEDREVLMWKFDSRIALRHIARELHPEFYKDC$ 

2003_CON_S nef.OPT

ATGGGCGGCAAGTGGTCCAAGTCCTCCATCGTGGGCTGGCCCGCCGTGCGCGAGCGCATCCGCCGCACCCCCCCCGCCGCCG
AGGGCGTGGGCGCCGTGTCCCAGGACCTGGACAAGCACGGCGCCATCACCTCCTCCAACACCGCCGCCACCAACGCCGACTG
CGCCTGGCTGGAGGCCCAGGAGGAGGAGGAGGTGGGCTTCCCCGTGCGCCCCCAGGTGCCCCTGCGCCCCATGACCTACAAG 
GGCGCCTTCGACCTGTCCCACTTCCTGAAGGAGAAGGGCGGCCTGGACGGCCTGATCTACTCCAAGAAGCGCCAGGAGATCC
TGGACCTGTGGGTGTACCACACCCAGGGCTACTTCCCCGACTGGCAGAACTACACCCCCGGCCCCGGCATCCGCTACCCCCT
GACCTTCGGCTGGTGCTTCAAGCTGGTGCCCGTGGACCCCGAGGAGGTGGAGGAGGCCAACGAGGGCGAGAACAACTGCCTG 
CTGGACCCGATGTGCCAGCACGGGATGGAGGACGAGGACCGCGAGGTGCTGAT 
GCCACATCGCCCGCGAGCTGCACCCCGAGTTCTACAAGGACTGCTAA 

32. 2003_M.GROUP.anc nef.PEP

MGGKWSKSSIVGWPAVRERMRRTAPAAEGVGAVSQDLDKHGAITSSNTAATNADCAWLEAQEEEEVGFPVRPQVPLRPMTYK

AAFDLSHFLKEKGGLDGLIYSKKRQEILDLWVYHTQGYFPDWQNYTPGPGIRYPLTFGWC

LHPMCQHGMEDEEREVLMWKFDSRLALRHIARELHPEFYKDC$






2003_M.GROUP.anc nef.OPT

ATGGGCGGC^GTGGTCCAAGTCCTCC^TCGTGGGCro 

AGGGCGTGGGCGCCGTGTCCCAGGACCTGGAC^GCACGGCGC 

CGCCTGGCTCK3AGGCCCAGGAGGAGGAGGAGGTGGGCTTCCCCGTC 

GCCGCCTTCGACCTGTCCCACTTCCTGAAGGAGAAGGGCGGCCTGGACGGCCTGATCTACTCCAAGAAGCGCCAGGAGATCC 

TGGACCTGTGGGTGTACCACACCCAGGGCTACTTCCCCGACTGGCAGAACTACACCCCCGGCCCCGGCATCCGCTACCCCCT 

GACCTTCGGCTGGTGCTTCAAGCTGGTGCCCGTGGACCCCGAGGAGGTGGAGGAGGCCAACGAGGGCGAGAACAACTGC^ 

CTGCACCCCATGTGCCAGCACGGCATGGAGGACGAGGAGCGCGAGGTGCTGATGTGGAAGTTCGACTCCCGCCTGGCCCTGC 

GCCACATCGCCCGCGAGCTGCACCCCGAGTTCTACAAGGACTGCTAA 
33. 2003_CON_A nef.PEP

MGGKWSKSSIVGWPDIRERIRRTPPAAKGVGAVSQDLDKYGAVTINNTAATQASCAWLEAQEEEEEVGFPVRPQVPLRPMTF

kgafdlsfflkekggi^gliysqkrqeildlwvyot^ 

LLHPICQHGMDDEEKEVLMWKFDSRLARRHIALEMHPEFYKDC$

2003_CON_A nef.OPT

atgggcgg^gtotctcc^gtcctccatcgtgggc^ 

agggcgtgggcgccgtgtccc^ggacctggacaagtacggcgccgtgaccatcaacaac^ccgccgc 
cgcctggctggaggcccaggaggaggaggaggaggtgggcttccccgtgcgc 

aagggcgccttcgacctgtccttcttcctgaaggagaagggcggcctggacggcctgatctactcccagaagcgccaggaga 
tcctggacctgtgggtgtacaacacccagggctacttccccgactg^ 

cctgaccttcggctggtgcttcaagctggtgcccgtggaccccgacgaggtggaggaggccaccgagggc^ 
CTGCTGCACCCCATCTGCCAGCACGGCATGGACGACGAGGAGAAGGAGGTGCTGATGTGGAAGTTCGACTCCCGCCTGGCCC
GCCGCCACATCGCCCTGGAGATGCACCCCGAGTTCTACAAGGACTGCTAA

34. 2003_CON_A1 nef.PEP

MGGKWSKSSIVGWPEVRERMRRTPPAATGVGAVSQD

LDLSHFLKEKGGLDGLIYSRKRQEILDLWVYHT

PICQHGMDDEEREVLKWKFDSRLALKHRAQELHPEFYKDC$

2003_CON_A1 nef.OPT

ATGGGCGGCAAGTGGTCCAAGTCCTCCATCGTGGGCTGGCCCGAGGTGCGCGAGCGCATGCGCCGCACCCCCCCCGCCGCCA
CCGGCGTGGGCGCCGTGTCCCAGGACCTGGACAAGCACGGCGCCGTGACCTCCTCCAACATCAACCACCCCTCCTGCGTGTG 
GCTGGAGGCCCAGGAGGAGGAGGAGGTGGGCTTCCCCGTGCGCCCCCAGGTGCCCCTGCGCCCCATGACCTACAAGGGCGCC 
CTGGACCTGTCCCACTTCCTGAAGGAGAAGGGCGGCCTGGACGGCCTGATCTACTCCCGCAAGCGCCAGGAGATCCTGGACC 
TGTGGGTGTACCACACCCAGGGCTACTTCCCCGACTGGCAGAACTACACCCCCGGCCCCGGCATCCGCTACCCCCTGACCTT 
CGGCTGGTGCTTCAAGCTGGTGCCCGTGGACCCCGACGAGGTGGAGAAGGCCACCGAGGGCGAGAACAACTCCCTGCTGCAC 
CCCATCTGCCAGCACGGCATGGACGACGAGGAGCGCGAGGTGCTGAAGTGGAAGTTCGACTCCCGCCTGGCCCTGAAGCACC 
GCGCCCAGGAGCTGCACCCCGAGTTCTACAAGGACTGCTAA 

35. 2003_A1.anc nef.PEP

MGGKWSKSSIVGWPEVRERMRRTPPAAKGVGAVSQDI^ 
GAFDLSHFLKEKGGLIXSLIYSKKRQEIDDIiWVYHT^ 
LHPICQHGMDDEEREVLMWKFDSRLALKHRARELHPEFYKDC$

2003_A1.anc nef.OPT

ATGGGCGGCAAGTGGTCCAAGTCCTCCATCGTGGGCTGGCCCGAGGTGCGCGAGCGCATGCGCCGCACCCCCCCCGC

AGGGCGTGGGCGCCGTGTCCCAGGACCnKSAGAAGCACGGCGCCG 

CGCCTGGCTGGAGGCCCAGGAGGAGGAGGAGGTGGGCTTCCCCGTGCGCCCCCAGGTGCCCCTGCGCCCCATGACCTACAAG 
GGCGCCTTCGACCTGTCCCACTTCCTGAAGGAGAAGGGCGGCCTGGACGGCCTGATCTACTCCAAGAAGCGCCAGGAGA 
TGGACCTGTGGGTGTACCACACCCAGGGCTACTTCCCCGACTGGCAGAACTACACCCCCGGCCCCGGCATCCGCTACCCCCT 
GACCTTCGGCTGGTGCTTCAAGCTGGTGCCCGTGGACCCCGCCGAGGTGGAGGAGGCCACCGAGGGCGAGAACAACTCCCTG 
CTGCACCCCATCTGCCAGCACGGCATGGACGACGAGGAGCGCGAGGTGCTGATGTGGAAGTTCGACTCCCGCCTGGCCCTGA 
AGCACCGCGCCCGCGAGCTGCACCCCGAGTTCTACAAGGACTGCTAA

36. 2003_CON_A2 nef.PEP
MGGKWSKSSIVGWPAIRERMRKRTPPAAEGVGAVSQDLATRGAVTSSNTAATNPDCAWLEAQEEEEVGFPVRPQVPLRPMTF
KGAFDLSHFLKEKGGLDGLIYSQKRQDILDLWVYHTQGYFPDWQNYTPGPGTRYPLTFGWCFK
LLHPICQHGIEDPEREVLRWKFDSRLALRHRARELHPEFYKDC$






2003_CON_A2 nef.OPT

ATGGGCGGC^GTGGTCCAAGTCCTCCATCGTGGGCTX3GCCC^ 

CCGAGGGCGTGGGCGCCGTGTCCCAGGACCTGGCCACCCGCGGCGCCGTGACCTCCTCCAACACCGCCGCCACCAACCCCGA 
CTGCGCCTGGCTGGAGGCCCAGGAGGAGGAGGAGGTGGGCTTCCCCGTGCGCCCCCAGGTGCCCCTGCGCCCCATGACCTTC 
AAGGGCGCCTTCGACCTGTCCCACTTCCTGAAGGAGAAGGGCGGCCTGGACGGCCTGATCTACTCCCAGAAGCGCCAGGACA 
TCCTGGACCTGTGGGTGTACCACACCCAGGGCTACTTCCCCGACTGGCAGAACTACACCCCCGGCCCCGGCACCCGCT 
CCTGACCTTCGGCTGGTGCTTCAAGCTGGTGCCCGTGGACCCCTCCGAGGTGGAGGAGGCCACCGAGGGCGAGAACA 
CTGCTGCACCCCATCTGCCAGCACGGCATCGAGGACCCCGAGCGCGAGGTGCTGCGCTGGAAGTTCGACTCCCGCCTGGCCC
TGCGCCACCGCGCCCGCGAGCTGCACCCCGAGTTCTACAAGGACTGCTAA 

37. 2003_CON_B nef.PEP

MGGKWSKRSVVGWPTVRERMRRAEPAADGVGAVSRDLEKHGAITSSNTAANNADCAWLEAQEEEEVGFPVRPQVPLRPMTYK
GALDLSHFLKEKGGLEGLIYSQKRQDILDLWVYHTQGYFPDWQNYTPGPGIRYPLTFGWCFKL
LHPMSLHGMDDPEREVLVWKFDSRLAFHHMARELHPEYYKDC$









2003_CON_B nef.OPT

ATGGGCGGCAAGTGGTCCAAGCGCTCCGTGGTGGGCTGGCC

ACGGCGTGGGCGCCGTGTCCCGCGACCTGGAGAAGCACGGCGCCATCACCTCCTCCAACACCGCCGCCAACAACGCCGA 

CGCCTGGCTGGAGGCCCAGGAGGAGGAGGAGGTGGGCTTCCCCGTGCGCCCCCAGGTGCCCCTGCGCCCCATGACCTACAAG 

GGCGCCCTGGACCTGTCCCACTTCCTGAAGGAGAAGGGCGGCCTGGAGGGCCTGATCTACTCCCAGAAGCGCCAGGACATCC 

TGGACCTGTGGGTGTACCACACCCAGGGCTACTTCCCCGACTGGCAGAACTACACCCCCGGCCCCGGCATCCGCTACCCCCT

GACCTTCGGCTGGTGCTTCAAGCTGGTGCCCGTGGAGCCCGAGAAGGTGGAGGAGGCCAACGAGGGCGA 

CTGCACCCCATGTCCCTGCACGGCATGGACGACCCCGAGCGCGAGGTGCTGGTGTGGAAGTTCGACTCCCGCCTGGCCTTCC 
ACCACATGGCCCGCGAGCTGCACCCCGAGTACTACAAGGACTGCTAA 

38. 2003_B.anc nef.PEP

MGGKWSKSSMGGWPAVRERMKRAEPAADGVGAVSRDLEKHGAITSSNTAATNADCAWLEAQEEEEVGFPVRPQVPLRPMTYK
AALDLSHFLKEKGGLEGLIYSQKRQDILDLWVYHTQGYFPDWQNYTPGPGIRYPLTFGWCFKLVPVEPEKVEEAT

LHPMCQHGMDDPEKEVLVWKFDSRLAFHHMARELHPEYYKDC$

2003_B.anc nef.OPT

ATGGGCGGCAAGTGGTCCAAGTCCTCCATGGGCGGCTGGCCCGCCGTGCGCGAGCGCATGAAGCGCGCCGAGCCCGCCG 
ACGGCGTGGGCGCCGTGTCCCGCGACCTGGAGAAGCACGGCGCCATCACCT 

CGCCTGGCTGGAGGCCCAGGAGGAGGAGGAGGTGGGCTTCCCCGTGCGCCCCCAGGTGCCCCTGCGCCCCATGACCTACAAG 
GCCGCCCTGGACCTGTCCCACTTCCTGAAGGAGAAGGGCGGCCTGGAGGGCCTGATCTACTCCCAGAAGCGCCAGGACATCC
TGGACCTGTGGGTGTACCACACCCAGGGCTACTTCCCCGACTGGCAGAACTACACCCCCGGCCCCGGCATCCGCTACCCCCT
GACCTTCGGCTGGTGCTTCAAGCTGGTGCCCGTGGAGCCCGAGAAGGTGGAGGAGGCCACCGAGGGCGAG
CTGGACCCCATGTGCCAGCACGGCATGGACGACCCCGAGAAGGAGGTGCTGGTGTGGAA
ACCACATGGCCCGCGAGCTGCACCCCGAGTACTACAAGGACTGCTAA 

39. 2003_CON_02_AG nef.PEP

MGGKWSKSSIVGWPKVRERIRQTPPAATGVGAASQDLDRHGAITSSNTAATNADCAWLEAQEEEEVGFPVRPQVPLRPMTYK
AAVDLSHFLKEKGGLEGLIYSKKRQEILDLWVYHTQGFFPDWQNYTPGPGTRFPLTFGWCFKLV
LHPICQHGMEDEDREVLVWRFDSSLAFKHRARELHPEFYKDC$



2003_CON_02_AG nef.OPT

ATGGGCGGCAAGTGGTCCAAGTCCTCCATCGTGGGCTGGCCCAAGGTGCGCGAGCGCATCCGCCAGACCCCCCCCGCCGCCA
CCGGCGTGGGCGCCGCCTCCCAGGACCTGGACCGCCACGGCGCCATCACCTCCT

CGCCTGGCTGGAGGCCCAGGAGGAGGAGGAGGTGGGCTTCCCCGTGCGCCCCCAGGTGCCCCTGCGCCCCATGACCTACAAG 
GCCGCCGTGGACCTGTCCCACTTCCTGAAGGAGAAGGGCGGCCTGGAGGGCCTGATCTACTCCAAGAAGCGCCAGGAGATCC 
TGGACCTGTGGGTGTACCACACCCAGGGCTTCTTCCCCGACTGGCA 

GACCTTCGGCTGGTGCTTCAAGCTGGTGCCCATGGACCCCGCCGAGGTGGAGGAGGCCAACGAGGGCGAGAACAACTCCCTG 
CTGCACCCCATCTGCCAGCACGGCATGGAGGACGAGGACCGCGAGGTGCTGGTGTGGCGCTTCGACTCCTCCCTGGCCTTCA 
AGCACCGCGCCCGCGAGCTGCACCCCGAGTTCTACAAGGACTGCTAA 
40. 2003_CON_C nef.PEP

MGGKWSKSSIVGWPAVRERIRRTEPAAEGVGAASQDLDKH^

KAAFDLSFFLKEKGGLEGLIYSKKRQEILDLWVYHTQGYFPDWQNYTPGPGVRYPLTFGWCFKLV

LLHPMSQHGMEDEDREVLKWKFDSHLARRHMARELHPEYYKDC$

2003_CON_C nef.OPT

ATGGGCGGCAAGTGGTCCAAGTCCTCCATCGTGGGCTGGCCCGCCGTGCG 

AGGGCGTGGGCGCCGCCTCCCAGGACCTGGACAAGCACGGCGCCCTGACCTCCTCCAACACCGCCACCAACAACGCCGACTG 

CGCCTGGCTGGAGGCCCAGGAGGAGGAGGAGGAGGTGGGCTTCCCCGTGCGCCCCCAGGTGCCCCTGCGCCCCATGACCTAC

AAGGCCGCCTTCGACCTGTCCTTCTTCCTGAAGGAGAAGGGCGGCCTGGAGGGCCTGATCTACTCCAAGAAGCGCCAGGAGA

TCCTGGACCTGTGGGTGTACCACACCCAGGGCTACTTCCCCGACTGGCAGAACTACACCCCCGGCCCCGGCGTGCGCTACCC

CCTGACCTTCGGCTGGTGCTTCAAGCTGGTGCCCGTGGACCCCCGCGAGGTGGAGGAGGCCAACGAGGGCGAGAACAACTGC 

CTGCTGCACCCCATGTCCCAGCACGGCATGGAGGACGAGGACCGCGAGGTGCTGAAGTGGAAGTTCGACTCCCACCTGG 

GCCGCCACATGGCCCGCGAGCTGCACCCCGAGTACTACAAGGACTGCTAA 

41. 2003_C.anc nef.PEP

MGGKWSKSSIVGWPAVRERMRRTEPAAEGVGAASQDLDKHGALTSSNT^

KAAFDLSFFLKEKGGLDGLIYSKKRQEILDLWVYHTQGYFPDW

LLHPMSQHGMEDEDREVLKWKFDSHLARRHMARELHPEYYKDC$

2003_C.anc nef.OPT

ATGGGCGGCAAGTGGTCCAAGTCCTCCATCGTGGGCTGGCCCGCCGTGCGCGAGCGCATGCGCCGCACCGAGCCCGCCGCCG
AGGGCGTGGGCGCCGCCTCCCAGGACCTGGACAAGCACGGCGCCC^

CGCCTGGCTGGAGGCCCAGGAGGAGGAGGAGGAGGTGGGCTTCCCCGTGCGCCCCCAGGTGCCCCTGCGCCCCATGACCTAC 
AAGGCCGCCTTCGACCTGTCCTTCTTCCTGAAGGAGAAGGGCGGCCTGGACGGCCTGATCT^ 
TCCTGGACCTGTGGGTGTACCACACCCAGGGCTACTTCCCCGACTGGCAGAACTACACCCCCGGCCCCGGCGTGCGCTACCC 
CCTGACCTTCGGCTGGTGCTTCAAGCTGGTGCCCGTGGACCCCCGCGAGGTGGAGGAGGCCAACGAGGGCGAGAACAACTGC 
CTGCTGCACCCCATGTCCCAGCACGGCATGGAGGACGAGGACCGCGAGGTGCTGAAGTGGAAGTTCGACTCCCACCTGGCCC 
GCCGCCACATGGCCCGCGAGCTGCACCCCGAGTACTACAAGGACTGCTAA 

42. 2003_CON_D nef.PEP

MGGKWSKSSIVGWPAIRERIRRTEPAADGVGAVSRDLEKHGAITSSNTAATN^
KAALDLSHFLKEKGGLEGLVWSQKRQEILDLWVYNTQGFFPDWQN^
LLHPMCQHGMEDPEREVLMWRFNSRLAFEHKARVLHPEFYKDC$

2003_CON_D nef.OPT

ATGGGCGGCAAGTGGTCCAAGTCCTCCATCGTGGGCTGGCCCGCCATCCGCGAGCGCATCCGCCGCACCGAGCCCGCCGCCG 
ACGGCGTGGGCGCCGTGTCCCGCGACCTGGAGAAGCACGGCGCCATCACCTCCTCCAACACCGCCGCCACCAACGCCGACTG 
CGCCTGGCTGGAGGCCCAGGAGGAGGACGAGGAGGTGGGCTTCCCCGTGCGCCCCCAGGTGCCCCTGCGCCCCATGACCTAC 
AAGGCCGCCCTGGACCTGTCCCACTTCCTGAAGGAGAAGG^ 

TCCTGGACCTGTGGGTGTACAACACCCAGGGCTTCTTCCCCGACTGGCAGAACTACACCCCCGGCCCCGGCATCCGCTACCC 
CCTGACCTTCGGCTGGTGCTTCGAGCTGGTGCCCGTGGACCCCGAGGAGGTGGAGGAGGCCACCGAGGGCGAGAACAACTGC
CTGCTGCACCCGATGTGCCAGCACGGCATGGAGGACCCCGAGCGCGAGGTGCTGATGTGGCGCTTCAACTCCCGCCTGGCCT 
TCGAGCACAAGGCCCGCGTGCTGCACCCCGAGTTCTACAAGGACTGCTAA 

43. 2003_CON_F1 nef.PEP

MGGKWSKSSIVGWPAVRERMRPTPPAAEGVGAVSQDLERRGAITSSNTGATN^
GAVDLSHFLKEKGGLEGLIYSKKRQEILDLWVYHTQGY
LHPMSQHGMEDEDREVLIWKFDSRLALRHIARERHPEFYQD$

2003_CON_F1 nef.OPT

ATGGGCGGCAAGTGGTCCAAGTCCTC<^TTCTGG^ 

AGGGCGTGGGCGCCGTGTCCCAGGACCTGGAGCGCCGCGGCGCCATCACCT

GGCCTGGCTGGAGGCCCAGGAGGAGGAGGAGGTGGGCTTCCCCGTGCGCCCCCAGGTGCCCCTGCGCCCCATGACCTACAAG 
GGCGCCGTGGACCTGTCCCACTTCCTGAAGGAGAAGGGCGGCCTGGAGGGCCTGATCTACTCCAAGAAGCGCCAGGAGATCC 
TGGACCTGTGGGTGTACCACACCCAGGGCTACTTCCCCGACTGGCAGAACTACACCCCCGGCCCCGGCATCCGCTACCCCCT 
GACCTTCGGCTGGTGCTTCAAGCTGGTGCCCGTGGACCCCGAGGAGGTGGAGAAGGCCAACGAGGG 

CTGCACCCCATGTCCCAGCACGGCATGGAGGACGAGGACCGCGAGGTGCTGATCTGGAAGTTCGACTCCCGCCTGGCCCTGC 
GCCACATCGCCCGCGAGCGCCACCCCGAGTTCTACCAGGACTAA 

44. 2003_CON_F2 nef.PEP

MGGKWSKSSIVGWPTIRERIRRTPVAAEGVGAVSQDLDKHGAITSSNTRATNADLAWLEAQEDEEVGFPV^
AAFDLSHFLKEKGGLEGLIYSKKRQEILDLWVYHTQGYFPDWQNYTPGPGTRYPLTFGWCFKL
LHPMSLHGMEDEDREVLKWKFDSRLALRHIARERHPEYYKD$

2003_CON_F2 nef.OPT

ATGGGCGGCAAGTGGTCCAAGTCCTCCATCGTGGGCTGGCCCACCATCCGCGAGCGCATCCGCCGCACCCCCGTGGCCGCCG 
AGGGCGTGGGCGCCGTGTCCCAGGACCTGGACAAGCACGGCGCCAT

GGCCTGGCTGGAGGCCCAGGAGGACGAGGAGGTGGGCTTCCCCGTGCGCCCCCAGGTGCCCCTGCGCCCCATGACCTACAAG
GCCGCCTTCGACCTGTCCCACTTCCTGAAGGAGAAGGGCGGCCTGGAGGGCCTGATCTACTCCAAGAAGCGCCAGGAGATCC
TGGACCTGTGGGTGTACCACACCCAGGGCTACTTCCCCGACTGGCAGAACTACACCCCCGGCCCCGGCACCCGCTACCCCCT
GACCTTCGGCTGGTGCTTCAAGCTGGTGCCCGTGGACCCCGAGGAGGTGGAGAAGGCCAACGAGG^ 

CTGCACCCCATGTCCCTGCACGGCATGGAGGACGAGGACCGCGAGGTGCTGAAGTGGAAGTTCGACTCCCGCCTGGCCCTGC 
GCCACATCGCCCGCGAGCGCCACCCCGAGTACTACAAGGACTAA 

45. 2003_CON_G nef.PEP

MGGKWSKSSIVGWPEVRERIRQTPPAAEGVGAVSQDLARHGAITSSNTAANNPDCAWLEAQEEDSEVGFPVRPQVPLRPMTY
KGAFDLSFFLKEKGGLDGLIYSKKRQDILDLWVYNTQGFFPDWQNYTPGPGTRFPLTFGWCFKLVPMDPAKVEEANKGENNS
LLHPICQHGMEDEDREVLVWRFDSSLARRHIARELHPEYYKDC$

2003_CON_G nef.OPT

ATGGGCGGCAAGTGGTCCAAGTCCTCCATCGTGGGCTGGCCCGAGGTGCGCGAGCGCATCCGCCAGACCCCCCCCGCCGCCG
AGGGCGTGGGCGCCGTGTCCCAGGACCTGGCCCGCCACGGCGCCATCACCTCCTCCAACACCGCCGCCAACAACCCCGACTG
CGCCTGGCTGGAGGCCCAGGAGGAGGACTCCGAGGTGGGCTTCCCCGTGCGCCCCCAGGTGCCCCTGCGCCCCATGACCTAC 
AAGGGCGCCTTCGACCTGTCCTTCTTCCTGAAGGAGAAGGGGGGCCTGGACGGCCTGATCTACTCCAAGAAGCGCCAGGACA 
TCCTGGACCTGTGGGTGTACAACACCCAGGGCTTCTTCCCCGACTGGCAGAACTACACCCCCGGCCCCGGCACCCGCTTCCC 
CCTGACCTTCGGCTGGTGCTTCAAGCTGGTGCCCATGGACCCCGCCGAGGTGGAGGAGGCCAACAAGGGC 
CTGCTGCACCCCATCTGCCAGCACGGCATGGAGGACGAGGACCGCGAGGTGCTGGTGTGGCGCTTCGACTCCTCCCTGGCCC 
GCCGCCACATCGCCCGCGAGCTGCACCCCGAGTACTACAAGGACTGCTAA 

46. 2003_CON_H nef.PEP

MGGKWSKSSIGGWPAIRERIRRAEPAAEGVGAVSRDLDRRGAVTINNTAS
KGAFDLSHFLKEKGGLEGLIYSKKRQEILDLWVYN^

LLHPICQHGMEDEEREVLMWKFDSRLAFRHIARELHPEFYKDC$

2003_CON_H nef.OPT

ATGGGCGGCAAGTGGTCCAAGTCCTCCATCGGCGGCTGGCCCGCCATCCGCGAGCGCATCCGCCGCGCCGAGCCCGCCGCCG 
AGGGCGTGGGCGCCGTGTCCCGCGACCTGGACCGCCGCGGCGCCGTGACCATCAACAACACCGCCTCCACCAACCCCGACTC 
CGCCTGGCTGGAGGCCCAGGAGGAGGAGGAGGAGGTGGGCTTCCCCGTGCGCCCCCAGGTGCCCCTGCGCCCCATGAC
AAGGGCGCCTTCGACCTGTCCCACTTCCTGAAGGAGAAGGGCGGCCTGGAGGGCCTGATCTACTCCAAGAAGCGCCAGGAGA
TCCTGGACCTGTGGGTGTACAACACCCAGGGCTACTTCCCCGAC^

CCTGACCTTCGGCTGGTGCTTCAAGCTGGTGCCCGTGGACCCCCAGGAGGTGGAGAAGGCCAACGAGGGCGAG 

CTGCTGCACCCCATCTGCCAGCACGGCATGGAGGACGAGGAGCGCGAGGTGCTGATGTGGAAGT^ 

TCCGCCACATCGCCCGCGAGCTGCACCCCGAGTTCTACAAGGACTGCTAA 

47. 2003_CON_01_AE nef.PEP

MGGKWSKSSIVGWPQVRERIKQTPPATEGVGAVSQDLDKHGAVTSSNMNNADCVWLRAQEEEEVGFPVRPQVPLRPMTYKGA

FDLSFFLKEKGGLDGLIYSKKRQEILDLWVYNTQ

PMSQHGIEDEEREVLMWKFDSALARKHIARELHPEYYKDC$

2003_CON_01_AE nef.OPT

ATGGGCGGCAAGTGGTCCAAGTCCTCCATCGTGGGCTGGCCCCAGGTGCGCGAGCGCATCAAGCAGACCCCCCCCGCCACCG

AGGGCGTGGGCGCCGTGTCCCAGGACCTGGACAAGCACGGCGCCGTGACCTCCTCCAACATGAACAACG

GCTGCGCGCCCAGGAGGAGGAGGAGGTGGGCTTCCCCGTGCGCCCCCAGGTGCCCCTGCGCCCCATGACCTACAAGGGCGCC 
TTCGACCTGTCCTTCTTCCTGAAGGAGAAGGGCGGCCTGGACGGCCTGATCTACTCCAAGAAGCGCCAGGAGATCCTGGACC 
TGTGGGTGTACAACACCCAGGGCTTCTTCCCCGACTGGCAGAACTACACCCCCGGCCCCGGCATCCGCTACCCCCTGTGCTT 
CGGCTGGTGCTTCAAGCTGGTGCCCGTGGACCCCCGCGAGGTGGAGGAG 

CCCATGTCCCAGCACGGCATCGAGGACGAGGAGCGCGAGGTGCTGATGTGGAAGTTCGACTCCGCCCTGGCCCGCAAGCACA 
TCGCCCGCGAGCTGCACCCCGAGTACTACAAGGACTGCTAA 

48. 2003_CON_03_AE nef.PEP

MGGKWSKSSIVGWPQVRERIRRAPAPAARGVGPVSQDLDKYGAVT
KGAFDLSHFLKEKGGLDGLIYSKKRQEILDLWVYHTQGYFPDWQNYTPGPGIRFPL

LLHPICQHGMDDEEKEVLMWKFDSRLALTHRARELHPEFYKDC$

2003_CON_03_AE nef.OPT

ATGGGCGGCAAGTGGTCCAAGTCCTCCATCGTGGGCTGGCCCGAGGTC^
CCCGCGGCGTGGGCCCTCTGTCCCAGGATC^

CTGCGCCTGGCTGGAGGCCCAGAAGGAGGAGGAGGTGGGCTTCCCCGTGCGCCCCCAGGTGCCCCTGCGCCCCATGACCTAC
AAGGGCGCCTTCGACCTGTCCCACTTCCTGAAGGAGAAGGGCGGCCTGGACGGCCTGATCTACTCCAAGAAGCGCCAGGAGA 
TCCTGGACCTGTGGGTGTACCACACCCAGGGCTACTTCCCCGACTGGCAGAACTACACCCCCGGCCCCGGCATCGGCTTCCC 
CCTGACCTTCGGCTGGTGCTACAAGCTGGTGCCCGTGGACCCCGACGAGGTGGAGGAGGCCACCGAGGGCGAGAACAACTCC 
CTGCTGCACCCCATCTGCGAGCACGGGATGGACGACGAGGAGAAGGAGGTGCTGAT^
TGACCCACCGCGCCCGCGAGCTGCACCCCGAGTTCTACAAGGACTGCTAA 

49. 2003_CON_04_CFX nef.PEP

MGGKWSKSSIVGWPAIRERMRQRGPAQAEPAAAGVGAVSQDLDKHGAITSSNTAATNPDKAWLEAQEEEEEVGFPVRPQVPL

RPMTFKAALDLSHFLKEKGGLDGLIYSKKRQEILDLWVYNTQGYFPDWQNYTPGPGE

GENNCLLHPISQHGMEDEEREVLKWKFDSRLAYKHIARELHPEFYKDC$

2003_CON_04_CFX nef.OPT

ATGGGCGGCAAGTGGTCCAAGTCCTCCATCGTGGGCTGGCCCGCCATCCGCGAGCGCATGCGCCAGCGCGGCCCCGCCCAGG 
CCGAGCCCGCCGCCGCCGGCGTGGGCGCCGTGTCCCAGGACCTGGACAAGCACGGCGCCATCACCTCCTCCAACACCGCCGC
CACCAACCCCGACAAGGCCTGGCTGGAGGCCCAGGAGGAGGAGGAGGA 

CGCCCCATGACCTTCAAGGCCGCCCTGGACCTGTCCCACTTCCTGAAGGAGAAGGGCGGCCTGGACGGCCTGATCTACTCCA 
AGAAGCGCCAGGAGATCCTGGACCTGTGGGTGTACAACACCCAGGGCTACTTCCCCGACTGGGAGAACT

CGGCGAGCGCTTCCCCCTGTGCTTCGGCTGGTGCTTCAAGCTGGTGCCCGTGGACCCCCAGGAGGTGGAGGAGGCCACCGAG
GGCGAGAACAACTGCCTGCTGCACCCCATCTCCCAGCACGGCATGGAGGACGAGGAGCGCGAGGTGCTGAAGTGGAAGTTCG 
ACTCCCGCCTGGCCTACAAGCACATCGCCCGCGAGCTGCACCCCGAGTTCTACAAGGACTGCTAA 

50. 2003_CON_06_CFX nef.PEP

MGGKWSKSSIVGWPQVRERMRNPPTEGAAEGVGAVSQDLDKHGAITSSNTATTNAACAWT^

YKGAFDLSFFLKEKGGLDGLIYSKKRQEILDLWVYHTQGFFPDW^
CLLHPMCQHGVEDEEREVLMWKFDSSLARRHIAREMHPEFYKDC$

2003_CON_06_CFX nef.OPT

ATGGGCGGCAAGTGGTCCAAGTCCTCCATCGTGGGCTGGCCCCAGGTGCGCGAGCGCATGCGCAACCCC^ 

CCGCCGAGGGCGTGGGCGCCGTGTCCGAGGACCTGGACAAGCACGGCGCCATCACCTCCTCCAACACCGCCACCACCAACGC 

CGCCTGCGCCTGGCTGGAGGCCCAGACCGAGGACGAGGTGGGCTTCCCCGTGCGCCCCCAGGTGCCCCTGCGCCCCATGACC 
TACAAGGGCGCCTTCGACCTGTCCTTCTTCCTGAAGGAG

AGATCCTGGACCTGTGGGTGTACCACACCCAGGGCTTCTTCCCCGACTGGCAGAACTACACCCCCGGCCCCGGCATCCGCTA 
CCCCCTGACCTTCGGCTGGTGCTACAAGCTGGTGCCCGTGGACCCCAAGGAGGTGGAGGAGGACACCAAGGGCGAGAACA 
TGCCTGCTGCACCCCATGTGCCAGCACGGCGTGGAGGACGAG 
CCCGCCGCCACATCGCCCGCGAGATGCACCCCGAGTTCTACAAGGACTGCTAA 



51. 2003_CON_08_BC nef.PEP

MGGKWSKSSIVGWPAIRERIRRTEPAADGVGAVSRDLEKHGAITSSNTADTNADCAWLETQEEEEVGFPVRPQVPLRPMTFK
GALDLSFFLKEKGGLEGLIYSKKRQEILDLWVYHTQGYFPDWHNYTPGPGVRFPLTFGWCFKLVPVDPREVEEANEGEDNCL
LHPVCQHGMEDEHREVLKWKFDSQLAHRHRARELHPEFYKDC$

2003_CON_08_BC nef.OPT

ATGGGCGGCAAGTGGTCCAAGTCCTCCATCGTGGGCTGGCCCGCCATCCGCGAGCGCATCCGCCGCACCGAGCCCGCCGCCG

ACGGCGTGGGCGCCGTGTCCCGCGACCTGGAGAAGCACGGCGCCAT

CGCCTGGCTGGAGACCCAGGAGGAGGAGGAGGTGGGCTTCCCCGTGCGCCCCCAGGTGCCCCTGCGCCCCATGACCTTCAA

GGCGCCCTGGACCTGTCCTTCTTCCTG

TGGACCTGTGGGTGTACCACACCCAGGGCTACTTCCCCGACTGGCACAACTACACCCCC^
GACCTTCGGCTGGTGCTTCAAGCTGGTGCCCGTGGACCCCC^
CTGCACCCCGTGTGCCAGCACGGCATGGAGGACGAGCACCGCGAGGTGCTGAAGTGG
GCCACCGCGCCCGCGAGCTGCACCCCGAGTTCTACAAGGACTGCTAA

52. 2003_CON_10_CD nef.PEP

MGGKWSKSSIVGWPAVRERIRRTDPAAEGVGAASRDLEKYGAITSSNTAQTNPDCAWLEAQEEEEEVGFPVRPQVPLRPMTY

KGAFDLSFFLKEKGGLEGLIYSKRRQDILDLWVYNT^

LLHPMSLHGMKDPHGEVLMWKFDSNLAHKHMARELHPEYYKDC$

2003_CON_10_CD nef.OPT

ATGGGCGGCAAGTGGTCCAAGTCCTCCATCGTGGGCTGGCCCGCCGTGCGCGAGCGCATCCGCCGCACCGACCCCGCCGCCG 
AGGGCGTGGGCGCCGCCTCCCGCGACCTGGAGAAGTACGGCGCCATCACCTCCTCCAACACCGCCCAGACCAACCCCGACTG
CGCCTGGCTGGAGGCCCAGGAGGAGGAGGAGGAGGTGGGCTTCCCCGTC 

AAGGGCGCCTTCGACCTGTCCTTCTTCCTGAAGGAGAAGGGCGGCCTGGAGGGCCTGATCTACTCCAAGCGCCGCCAGGACA 
TCCTGGACCTGTGGGTGTACAACACCCAGGGCTTCTTCCCCGACTGGCAGAACTACACCCCCGGCCCCGGCATCCGCTACCC 
CCTGACCTTCGGCTGGTGCTACAAGCTGGTGCCCGTGGACCCCC 

CTGCTGCACCCCATGTCCCTGCACGGCATGGAGGACCCCCACGGCGAGGTGCTGATGTGGAAGTTCGACTCCAACCTGGCCC 
ACAAGCACATGGCCCGCGAGCTGCACCCCGAGTACTACAAGGACTGCTAA 

53. 2003_CON_11_CFX nef.PEP

MGGKWSKSSIVGWPEIRERLRRTPPTAAAEGVGAVSKDLEKHGAVTSSN^
YKGAFDLGFFLKEKGGLDGLIYSKKRQEILDLWVYHTQGYFPDW
CLLHPMSQHGMDDEEREVLMWKFDSSLARRHIARELHPDFYKDC$

2003_CON_11_CFX nef.OPT

ATGGGCGGCAAGTGGTCCAAGTCCTCCATCGTGGGCTG 

CCGCCGAGGGCGTGGGCGCCGTGTCCAAGGACCTGGAGAAGCACGGCGCCGT^

CGCCTGCGCCTGGCTGGAGGCCCAGGAGGAGGAGGAGGTGGGCTTCCCCGTGCGCCCCCAGGTGCCCCTGCGCCCCATGACC
TACAAGGGCGCCTTCGACCTGGGCTTCTTCCTGAAGGAGAAGGGCGGCCTGGACGGCCTGATCTACTCCAAGAAGCGCCAGG
AGATCCTGGACCTGTGGGTGTACCACACCCAGGGCTACTTCCCCGACTGGCAGAACTACACCCCCGGCCCCGGCATCCGCTA 
CCCCCTGTGCTTCGGCTGGTGCTTCAAGCTGGTGCCCGTGGAGCCCCGCGAGGTGGAGGAGGCCAACGAGGGCGAGAACAAC 
TGCCTGCTGCACCCCATGTCCCAGCACGGCATGGACGACGAGGAGCGCGAGGTGCTGATGTGGAAGTTCGACTCCTCC 
CCCGCCGCCACATCGCCCGCGAGCTGCACCCCGACTTCTACAAGGACTGCTAA 

54. 2003_CON_12_BF nef.PEP
MGGKWSKSSIVGWPDIRERMRRAPPAAEGVGAVSQDLENRGAITSSNTRANNPDLAWLEAQEEEEV^
GALDLSHFLKEKGGLEGLIYSKKRQEILDLWVYHTQGYFPDWQNYTPGPGIRYPLTFGWCFKLVPVDPEEVEKANEGENNCL

LHPMSQHGMEDEDREVLMWKFDSRLALRHIAREKHPEFYQDC$

2003_CON_12_BF nef.OPT

ATGGGCGGCAAGTGGTCCAAGTCCTCCATCGTGGGCTGGCCCGACATCCGCGAGCGCATGCGCCGCGCCCCCCCCGCCGCCG 
AGGGCGTGGGCGCCGTGTCCCAGGACCTGGAGAACCGCGGCGCCATCACCTCCTCCAACACCCGCGCCAACAACCCCGACCT 



GGCCTGGCTGGAGGCCCAGGAGGAGGAGGAGGTGGGCTTCCCCGTGCGCCCCCAGGTGCCCCTGCGCCCCATGACCTACAAG

GGCGCCCTGGACCTGTCCCACTTCCTGAAGGAGAAGGGCGGCCTGGAGGGCCTGATCTACTCCAAGAAGCGCCAGGAGATCC 

TGGACCTGTGGGTGTACCACACCCAGGGCTACTTCCCCGACTGGCAGAACTACACCCCCGGCCCCGGCATCCGCTACCCCCT 

GACCTTCGGCTGGTGCTTCAAGCTGGTGCCCGTGGACCCCGAGGAGGTGGAGAAGGCCAACGAGGGCGAGA 

CTGCACCCCATGTCCCAGCACGGCATGGAGGACGAGGACCGCGAGGTGCTGATGTGGAAGTTCGACTCCCGCCTGGCCCTGC 

GCCACATCGCCCGCGAGAAGCACCCCGAGTTCTACCAGGACTGCTAA 

2003_CON_14_BG nef.PEP

MGGKWSKCSIVGWPEVRERIRRTPPAAVGVGAVSQDLAKHGAITSSNTAANNPDCAWLEAQEEDSEVGFPVRPQVPLRPMTY
KGAFDLSFFLKEKGGLDGLIYSKQRQDILDLWVYNTQGFFPDWQNYT^
LLHPICQHGMEDADNEVLIWRFDSSLARRHIARELHPDFYKDC$

2003_CON_14_BG nef.OPT

ATGGGCGGCAAGTGGTCCAAGTGCTCCATCGTGGGCTGGCCCGAGGTGCGCGAGCGCATCCGCCGCACCCCCCCCGCCGCCG 
TGGGCGTGGGCGCCGTGTCCCAGGACCTGGCCAAGCACGGCGCCATCACCTCCTCC^ 

CGCCTGGCTGGAGGCCCAGGAGGAGGACTCCGAGGTGGGCTTCCCCGTGCGCCCCCAGGTGCCCCTGCGCCCCATGACCTAC 

AAGGGCGCCTTCGACCTGTCCTTCTTCCTGAAGGAGAAGGGCGGCCTGGACGGCCTGATCTACTCCAA 

TCCTGGACCTGTGGGTGTACAACACCCAGGGCTTCTTCCCCGACTGGC^ 

CCTGACCTTCGGCTGGTGCTTCAAGCTGGAGCCCGTGGACCCCGCCGAGGTGGAGGAGGCCACCAAGGGCGAGAACAACTCC 

CTGCTGCACCCCATCTGCCAGCACGGCATGGAGGACGCCGACAACGAGGTGCTGATCTGGCGCT^

GCCGCCACATCGCCCGCGAGCTGCACCCCGACTTCTACAAGGACTGCTAA 
61. 2003_2003_CON_S pol.PEP

FFRENLAFQQGEAREFSSEQTRANSPTSRELRVRGGDNPLSEAGAER

ADDTVLEEINLPGKWKPKMIGGIGGFIKVRQYDQILIEICGKKM

KLKPGMDGPKVKQWPLTEEKIKALTEICTEMEKEGKIS

GIPHPAGLKKKKSVTVLDVGDAYFSVPLDEDFRKYTAFTIPSINNETPGIRYQYNVLPQGWKGSPAIFQSSMTKILEPFRTQ

NPEIVIYQYMDDLYVGSDLEIGQHRTKIEELREHLLRWGFT^

IQKLVGKLNWASQIYPGIKVKQLCKLLRGAKAL^

YQIYQEPFKNLKTGKYAKMRSAHTNDVKQLTEAVQKIAT^

PLVKLWYQLEKEPIVGAETFYVDGAANRETKLGKAGYVTD^

LGIIQAQPDKSESELVNQIIEQLIKKEKVYLSWVPAHK^

DFNLPPIVAKEIVASCDKCQLKGEAMHGQVDCSPGIWQLDCTHLEGKIILVAVHVASGYIEAEVIPAETGQETAYFILKLAG
RWPVKVIHTDNGSNFTSAAVKAACWWAGIQQEFGIPYNPQSQGVVESMNKELKKIIGQVRDQAEH^
GGIGGYSAGERIIDIIATDIQTKELQKQITKIQNFRVYYRDSRDPIWKGPAKLLWKGEGAVVIQDNSEIKV^
YGKQMAGDDCVAGRQDED$ 

2003_CON_S pol.OPT 

TTCTTCCGCGAGAACCTGGCCTTCCAGCAGGGCGAGGCCCGCGAGTTCT 

CCCGCGAGCTGCGCGTGCGCGGCGGCGACAACCCCCTGTCCGAGGCCGGCGCCGAGCGCCAGGGCACCGTGTCCCTGTCCTT
CCCCCAGATCACCCTGTGGCAGCGCCCCCTGGTGACCGTGAAGATCGGCGGCCAGCTGAAGGAGGCCCTGCTGGACACCGGC
GCCGACGACACCGTGCTGGAGGAGATCAACCTGCCCGGCAAGTGGAAGCCCAAGATGATCGGCGGCATCGGCGGCTTCATCA
AGGTGCGCCAGTACGACCAGATCCTGATCGAGATCTGCGGCAAGAAGGCCATCGGCACCGTGCTGGTGGGCCCCACCCCCGT
GAACATCATCGGCCGCAACATGCTGACCCAGATCGGCTGCACCCTGAACTTCC

AAGCTGAAGCCCGGCATGGACGGCCCCAAGGTGAAGCAGTGGCCCCTGACCGAGGAGAAGATCAAGGCCCTGACCGAGATCT 
GCACCGAGATGGAGAAGGAGGGCAAGATCTCCAAGATCGGCCCCGAGAACCCCTACAACACCCCCATCTTCGCCATCAAGAA
GAAGGACTCCACCAAGTGGCGCAAGCTGGTGGACTTCCGCGAGCTGAAC 

GGCATCCCCCACCCCGCCGGCCTGAAGAAGAAGAAGTCCGTGACCGTGCTGGACGTGGGCGACGCCTACTTCTCCGTGCCCC
TGGACGAGGACTTCCGCAAGTACACCGCCTTCACCATCCCCTCCATCAACAACGAGACCCCCGGCATCCGCTACCAGTACAA 
CGTGCTGCCCCAGGGCTGGAAGGGCTCCCCCGCCATCTTCCAGTCCTCCATGACCAAGATCCTGGAGCCCTTCCGCACCCAG 
AACCCCGAGATCGTGATCTACCAGTACATGGACGACCTGTACGTGGGCTCCGACCTGGAGATCGGCCAGCACCGCACCAAGA 
TCGAGGAGCTGCGCGAGCACCTGCTGCGCTGGGGCTTCACCACCCCCGACAAGAAGCACCAGAAGG 

GATGGGCTACGAGCTGCACCCCGACAAGTGGACCGTGCAGCCCATCCAGCTGCCCGAGAAGGACTCCTGGACCGTGAACGAC 
ATCCAGAAGCTGGTGGGCAAGCTGAACTGGGCCTCCCAGATCTACCCCGGGATCAAGG'^ 

GCGGCGCCAAGGCCCTGACCGACATCGTGCCCCTGACCGAGGAGGCCGAGCTGGAGCTGGCCGAGAACCGCGAGATCCTGAA 
GGAGCCCGTGCACGGCGTGTACTACGACCCCTCCAAGGACCTGATCGCCGAGATCCAGAAGCAGGGCCAGGACCAGTGGACC 
TACCAGATCTACCAGGAGCCCTTCAAGAACCTGAAGACCGGCAAGTACG 

AGCAGCTGACCGAGGCCGTGCAGAAGATCGCGACCGAGTCCATCGTGATCTGGGGCAAGACCCCCAAGTTCCGCCTGCCCAT 
CCAGAAGGAGACCTGGGAGACCTGGTGGACCGAGTACTGGCAGGCCACCTGGATCCCCGAGTC 

CCCCTGGTGAAGCTGTGGTACCAGCTGGAGAAGGAGCCCATCGTGGGCGCCGAGACCTTCTACGTGGACGGCGCCGCCAACC 

GCGAGACCAAGCTGGGCAAGGCCGGCTACGTGACCGACCGCGGCCGCCAGAAGGTGGTGTCCCTGACCGAGACCACCAACCA 

GAAGACCGAGCTGGAGGCCATCCACCTGGCCCTGGAGGACTCCGGCTCCGAGGTGAACATCGTGACCGACTCCCAGTACGCC 

CTGGGCATCATCCAGGCCCAGCCCGACAAGTCCGAGTCCGAGCTGGTGAACCAGATCATCGAGCAGCTGATCAAGA 

AGGTGTACCTGTCCTGGGTGCCCGCCCACAAGGGCATCGGCGGCAACGAGCAGGTGGACAAGCTGGTGTCCACCGGCATCCG 

CAAGGTGCTGTTCCTGGACGGCATCGACAAGGCCCAGGAGGAGCACGAGAAGTACCACTCCAACTGGCGCGCCATGGCCTC 

GACTTCAACCTGCCCCCCATCGTGGCCAAGGAGATCGTGGCCTCCTGCGACAA

GCCAGGTGGACTGCTCCCCCGGCATCTGGCAGCTGGACTGCACCCACCTGGAGGGCAAGATCATCCTGGTGGCCGTGCACG

GGCCTCCGGCTACATCGAGGCCGAGGTGATCCCCGCCGAGACCGGCCAGGAGACCGCCTACTTCATCCTGAAGCTGGCCGGC

CGCTGGCCCGTGAAGGTGATCCACACCGACAACGGCTCCAACT^

GCATCCAGGAGGAGTTCGGCATCCCCTACAACCCCCAGTCCGAGGGCGTGGTGGAGT 

CATCGGCCAGGTGCGCGACCAGGCCGAGCACCTGAAGACCGCCGTGCAGATGGCCGTGTTCATCCACAACTTCAAGCGCAAG 
GGCGGCATCGGCGGCTACTCCGCCGGCGAGCGCATCATCGACATCATCGCCACCGACATCCAGACCAAGGAGCTGCAGAAGC 
AGATCACCAAGATCCAGAACTTCCGCGTGTACTACCGCGACTCCCGCGACCCCATCTGGAAGGGCCCCGCCAAGCTGCTGTG
GAAGGGCGAGGGCGCCGTGGTGATCCAGGACAACTCCGAGATCAAGGTGGTGCCCCGCCGCAAGGCCAAGATCATCCGCGAC 
TACGGCAAGCAGATGGCCGGCGACGACTGCGTGGCCGGCCGCCAGGACGAGGACTAA 



62. 2003_M.GROUP.anc pol.PEP

FFRENLAFQQGEAREFSSEQTRANSPTSRELRVRGGDNPLSEAGAERQ

ADDTVLEEINLPGKWKPKMIGGIGGFIKVRQYDQIL

KLKPGMDGPKVKQWPLTEEKIKALTEICTEMEKEGKI^
GIPHPAGLKKKKSVTVLDVGDAYFSVPLDEDFRKYTAFTIPSINNETPGIRYQYNVLPQGWKGSPAIFQSSMTKILEPFRTK

NPEIVIYQYMDDLYVGSDLEIGQHRAKIEELREHLLRWG

IQKLVGKLNWASQIYPGIKVKQLCKLL^

YQIYQEPFKNLKTGKYAKMRSAHTNDVKQ^

PLVKLWYQLEKEPIVGAETFYVDGAANRETKLGKAGYVTDRGRQKVV^

LGIIQAQPDKSESELVNQIIEQLIKKEKVYLSWVPAHKG
DFNLPPVVAKEIVASCDKCQLKGEAMHGQVDCSPGIWQLDCTHLEGKVILVAVHVASGYIEAEVIPAETGQETAYFILKLAG
RWPVKVIHTDNGSNFTSAAVKAACWWAGIQQEFGIPYNPQSQGVVESMNKELKKIIGQVRDQAEHLKTAVQMAVFIHNFKRK

GGIGGYSAGERIIDIIATDIQTKELQKQITKIQNFRVYYRDSRDPIWKGPAKLLWKGEGAVVIQDNSEIKVVPRRKAKIIRD

YGKQMAGDDCVAGRQDED$

2003_M.GROUP.anc pol.OPT

TTCTTCCGCGAGAACCTGGCCTTCCAGCAGGGCGAGGCCCGCGAGTTCTCCTCCGAGCAGACCCGCGCCAACTCCCCCACCT
CCCGCGAGCTGCGCGTGCGCGGCGGCGACAACCCCCTGTCCGAGGCCGGCGCCGAGCGCCAGGGCACCGTGTCCTTCTCCTT
CCCCCAGATCACCCTGTGGCAGCGCCCCCTG

GCCGACGACACCGTGCTGGAGGAGATCAACCTGCCCGGCAAGTGGAAGCCCAAGATGATCGGCGGCATCGGCGGCTTCATCA

AGGTGCGCCAGTACGACCAGATCCTGATCGAGATCTGCGGCAAGAAGGCCATCGGCACCGTGCTG

GAACATCATCGGCCGCAAGATGCTGACCGAGATCGGCTGCACCCTGAACTTCCCCATCTCCC 

AAGCTGAAGCCCGGCATGGACGGCCCCAAGGTGAAGCAGTGGCCCCTGACCGAGGAGAAGATCAAGGCCCTGACCGAGATCT
GCACCGAGATGGAGAAGGAGGGCAAGATCTCCAAGATCGGCCCCGAGAACCCCTACAACACCCCCGTGTTCGCCATCAAGAA
GAAGGACTCCACCAAGTGGCGCAAGCTGGTGGACTTCCGCGAGCTC^

GGCATCCCCCACCCCGCCGGCCTGAAGAAGAAGAAGTCCGTGACCGTGCTGGACGTGGGCGACGCCTACTTCTCCGTGCCCC
TGGACGAGGACTTCCGCAAGTACACCGCCTTCACCATCCCCTCCATCAACAACGAGAC

CGTGCTGCCCCAGGGCTGGAAGGGCTCCCCCGCCATCTTCCAGTCCTCCATGACCAAGATCCTGGAGCCCTTCCGCACCAAG

AACCCCGAGATCGTGATCTACCAGTACATGGACGACCTGTACGTGGGCTCCGACCTGGAGATCGGCCAGCACCGCGCCAAGA 

TCGAGGAGCTGCGCGAGCACCTGCTGCGCTGGGGCTTCACCACCCCCGACAAGAAGCACCAGAAGGAGCCCCCCTTCCTGTG 

GATGGGCTACGAGCTGCACCCCGACAAGTGGACCGTGCAGCCCATCCAGCTGCCCGAGAAGGACTCCTGGACCGTGAACGAC 

ATCCAGAAGCTGGTGGGCAAGCTGAACTGGGCCTCCCAGATCTACCCCGGCATCAAGGTGAAGCAGCTGTGCAAGCT 

GCGGCGCCAAGGCCCTGACCGACATCGTGCCCCTGACCGAGGAGGCCGAGCTGGAGCTGGCCGAGAACCGCGAGATCCTGAA 

GGAGCCCGTGCACGGCGTGTACTACGACCCCTCCAAGGACCTGATCGCCGAGATCCAGAAGCAGGGCCAGGACCAGTGGACC 

TACCAGATCTACCAGGAGCCCTTCAAGAACCTGAAGACCGGGAAGTACGCCAAGATGCGCTCCGCCCACACCAA 

AGCAGCTGACCGAGGCCGTGCAGAAGATCGCCACCGAGTCCATCGTGATCTGGGGCAAGACCCCCAAGTTCCGCCTGCCCAT 

CCAGAAGGAGACCTGGGAGACCTGGTGGACCGAGTACTGGCAGGCCACCTGGATCCCCGAGTGGGAGTTCGTGAACACCCCC 

CCCCTGGTGAAGCTGTGGTACCAGCTGGAGAAGGAGCCCATCGTGGGCGCCGAGACCTTCTACGTGGACGGCGCCGCCAACC 

GCGAGACCAAGCTGGGCAAGGCCGGCTACGTGACCGACCGCGGCCGCCAGAAGGTGGTGTCCCTGACCGAGACCACCAACCA 

GAAGACCGAGCTGCAGGCCATCCACCTGGCCCTGCAGGACTCCGGCTCCGAGGTGAACATCGTGACCGACTCCCAGTACGCC 

CTGGGCATCATCCAGGCCCAGCCCGACAAGTCCGAGTCCGAGCTGGTGAACCAGATCATCGAGCAGCTGATCAAGA 

AGGTGTACCTGTCCTGGGTGCCCGCCCACAAGGGCATCGGCGGCAACGAGCAGGTGGACAAGCTGGTC

CAAGGTGCTGTTCCTGGACGGCATCGACAAGGCCCAGGAGGAGCACGAGAAGTACCAC^

GACTTCAACCTGCCCCCCGTGGTGGCCAAGGAGATCGTGGCCTCCTGCGACAAGTGCCAGCTGAAGGGCGAGGCCATGCACG 

GCCAGGTGGACTGCTCCCCCGGCATCTGGCAGCTGGACTGCACCCACCTGGAGGGCAAGGTGATCCTGGTGGCCGTGCACGT 

GGCCTCCGGCTACATCGAGGCCGAGGTGATCCCCGCCGAGACCGGCCAGGAGACCGCCTACTTGATCCTGAAGCTGGCCGGC 

CGCTGGCCCGTGAAGGTGATCCACACCGACAACGGCTCCAACTTCACCTCCGCCGCCGTGAAGGCCGCCTGCTGGTGGGCCG 

GCATCCAGCAGGAGTTCGGCATCCCCTACAACCCCCAGTCCCAGGGCGTGGTGGAGTCCATGAACAAGGAGCTGAAGAAGAT

CATCGGCCAGGTGCGCGACCAGGCCGAGCACCTGAAGACCGCCGTGCAGATGGCCGTGTTCATCCACAACTTCAAGCGCAAG

GGCGGCATCGGCGGCTACTCCGCCGGCGAGCGCATCATCGACATCATCGCCACCGACATCCAGACCAAGGAGCTGCAGA^ 

AGATGACCAAGATCCAGAACTTCCGCGTGTACTACCGCGACTCCCGCGACCCCATCTGGAAGGGCCCCGCCAAGCTGCTGTG 

GAAGGGCGAGGGCGCCGTGGTGATCCAGGACAACTCCGAGATCAAGGTGGTGCCCCGCCGCAAGGCCAAGATCATCCGCGAC 

TACGGCAAGCAGATGGCCGGCGACGACTGCGTGGCCGGCCGCCAGGACGAGGACTAA 

63. 2003_CON_A1 pol.PEP

FFRENLAFQQGEARKFSSEQTGANSPTSRDLWDGGRDSLPSEAGAERQGTGPTFSFPQI
GADDTVLEDINLPGKWKPKMIGGIGGFIKVKQYDQILIEICGKKAIGTVLVGPTPVN
VKLKPGMDGPKVKQWPLTEEKIKALTEICTEMEKEGKISKIGPENPYNTPIFAIKKKDST
LGIPHPAGLKKKKSVTVLDVGDAYFSVPLDESFRKYTAFTIPSTNNETPGIRYQYNV^

KNPEIIIYQYMDDLYVGSDLEIGQHRTKIEELRAHLLSWGFTTPDKKHQKEPPFLWMGYELHPDKWTVQPIEL
DIQKLVGKLNWASQIYAGIKVKQLCKLLRGA^

TYQIYQEPFKNLKTGKYARKRSAHTNDVKQLAEVVQKVVMESIVIWGKTPKFRLPI

PPLVKLWYQLEKDPIVGAETFYVDGAANRETKLGKAGYVTD
ALGIIQAQPDRSESELVNQIIEKLIGKDKVYLSWVPAHKGIGG^

SDFNLPPIVAKEIVASCDKCQLKGEAMHGQVDCSPGIWQLDCTHLEGKVILVAVHVAS

GRWPVKVVHTDNGSNFTSAAVKAACWWANIQQEFGIPYNPQSQGVVESMNKELKKIIGQVREQA

KGGIGGYSAGERIIDIIATDIQTKELQKQITKIQNFRVYYRDSRD

DYGKQMAGDDCVAGRQDED$

2003_CON_A1 pol.OPT 

TTCTTCCGCGAGAACCTGGCCTTCCAGCAGGGCGAGGCCCGCAAGTTCTCCTCCGAGCAGACCGGCGCCAACTCCCCCACCT 

CCCGCGACCTGTGGGACGGCGGCCGCGACTCCCTGCCCTCCGAGGCCGGCGCCGAGCGCCAGGGCACCGGCCCCACCTTCTC 

CTTCCCCCAGATCACCCTGTGGGAGCGCCCCCTGGTGACCGTGCGCATCGGCGGCCAGCTGAAGGAGGCCCTGCTGGA 

GGCGCCGACGACACCGTGCTGGAGGACATCAACCTGCCCGGCAAGTGGAAGCCCAAGATGATCGGCGGCATCGGCGGC 

TCAAGGTGAAGCAGTACGACCAGATCCTGATCGAGATCTGCGGCAAGAAGGCCATCGGCACCGTGCTGGTGGGCCCCACCCC 

CGTGAACATCATCGGCCGCAACATGCTGACCCAGATCGGCTGCACCCTGAACTT 

GTGAAGCTGAAGCCCGGCATGGACGGCCCCAAGGTGAAGCAGTGGCCCCTGACCGAGGAGAAGATCAAGGCCCTGACCGAGA 
TCTGCACCGAGATGGAGAAGGAGGGCAAGATCTCCAAGATCGGCCCCGAGAACCCCTAG 

GAAGAAGGACTCCACCAAGTGGCX5GAAGCTGGTGGACTTCCGCGAGCTGAACAAGCGCACCCAGGACTTCTGGG 
CTGGGCATCCCCCACCCCGCCGGCCTGAAGAAGAAGAAGTCCGTGACCGTGCTGGACGTGGGCGACGCCTAC^
CCCTGGACGAGTCCTTCCGCAAGTACACCGCCTTCACCATC^

CAACGTGCTGCCCCAGGGCTGGAAGGGCTCCCCCGCCATCTTCCAGTCCTCCATGACC^ 

AAGAACCCCGAGATCATGATCTACGAGTACATGGACGACCTGTACGTGGGCTCCGACCTGGAGATCGGCCAGCACCGCACCA 
AGATCGAGGAGCTGCGCGCCGACCTGCTGTCCTGGGGCTTCACCACCCCCGACAAGAAGCACGAGAAGGAGCCCCCCTTCCT 
GTGGATGGGCTACGAGCTGCACCCCGACAAGTGGACCGTGCAGCCCATCGAGCTGCCCGAGAAGGAGTCCTGGACCGTGAAC 
GACATCCAGAAGCTGGTGGGCAAGCTGAACTGGGCCTCCCAGATCTACGCCGGCATCAAGGTGAAGCAGCTGTGCAAGCTGC 
TGCGCGGCGCCAAGGCCCTGACCGACATCGTGACCCTGACCGAGGAGGCCGAGCTGGAGCTGGCCGAGAACCGCGAGATCCT 
GAAGGACCCCGTGCACGGCGTGTACTACGACCCCTCCAAGGACCTGATCGCCGAGATCCAGAAGCAGGGCCAGGACCAGTGG
ACCTACCAGATCTACCAGGAGCCCTTCAAGAACCTGAAGACCGGCAAGTACGCCCGCAAGCGCTCCGCCCACACCAACGACG
TGAAGCAGCTGGCCGAGGTGGTGCAGAAGGTGGTGATGGAGTCCATCGTGATCTGGGG 

CATCCAGAAGGAGACCTGGGAGACCTGGTGGATGGACTACTGGCAGGCCACCTGGATCCCCGAGTGGGAGTTCGTGAACACC
CCCCCCCTGGTGAAGCTGTGGTACCAGCTGGAGAAGGACCCCATCGTGGGCGCCGAGACCTTCTACGTGGACGGCGCCGCCA 
ACCGCGAGACCAAGCTGGGCAAGGCCGGCTACGTGACCGACCGCGGCCGCCAGAAGGTGGTGTCCCTGACCGAGACCACCAA 
CCAGAAGACCGAGCTGCACGCCATCCACCTGGCCCTGCAGGACTCCGGCTCCGAGGTGAACATCGTGACCGACTCCCAGTAC
GCCCTGGGCATCATCCAGGCCCAGCCCGACCGCTCCGAGTCCGAGCTGGTGAACCAGATCATCGAGAAGCTGATCGGCAAGG 
ACAAGGTGTACCTGTCCTGGGTGCCCGCCCACAAGGGCATCGGCGGCAACGAGCAGGTGGACAAGCTGGTGTCCTCCGGCAT 
CCGCAAGGTGCTGTTCCTGGACGGCATCGACAAGGCCCAGGAGGAGCACGAGCGCTACCACTCCAACTGGCGCGCCATGGCC 
TCCGACTTCAACCTGCCCCCCATCGTGGCCAAGGAGATCGTGGCCTCCTGCGACAAGTGCCAGCTGAAGGGCGAGGCCATGC 
ACGGCCAGGTGGACTGCTCCCCCGGCATCTGGCAGCTGGACTGCACCGACCTGGAGGGCAAGGTGATCCTGGTGGCCGTGCA 
CGTGGCCTCCGGCTACATCGAGGCCGAGGTGATCCCCGCCGAGACCGGCCAGGAGACCGCCTACTTCCTGCTGAAGCTGGCC 
GGCCGCTGGCCCGTGAAGGTGGTGCACACCGACAACGGCTCCAACTTCACCTCCGCCGCCGTGAAGGCCGCCTGCTGGTGGG
CCAACATCCAGCAGGAGTTCGGCATCCCCTACAA^ 

GATCATCGGCGAGGTGCGCGAGCAGGCCGAGCACCTGAAGACCGCCGTGCAGATGGCCGTGTTCATCCACAACTTC 

AAGGGCGGCATCGGCGGCTACTCCGCCGGCGAGCGGATCATCGAGATCATCGCCACCGACATCCAGACCAAGGAGCTGCAGA 

AGCAGATCACCAAGATCCAGAACTTCCGCGTGTACTACCGCGACTCCCGCGACCCCATCTGGAAGGGCCCCGCCAAGCTGCT

GTGGAAGGGCGAGGGCGCCGTGGTGATCCAGGAGAACTCCGACATCAAGGTGGTGCCCCGCCGCAAGG 

GACTACGGCAAGCAGATGGCCGGCGACGACTGCGTGGCCGGCCGCCAGGACGAGGACTAA 

64. 2003_A1.anc pol.PEP

FFRENLAFQQGEARKFSSEQTRANSPTSRELWDGGRDSLLSEAG
GADDTVLEDINLPGKWKPKMIGGIGGFIKVRQYDQILIEICGKKAIGTVLVGPT

VKLKPGMDGPKVKQWPLTEEKIKALTEICTEMEKEGKISKIGPENPYNTPVFAIKKKDSTKWRKLVDFRELNKRTQDFWEVQ
LGIPHPAGLKKKKSVTVLDVGDAYFSVPLDESFRKYTAFTIPSINNETPGIRYQYNVLPQGWKGSPAIFQSSMTKILEPFRS
KNPEIVIYQYMDDLYVGSDLEIGQHRAKIEELRAHLLSWGFTTPDKKHQKEPPFLW^
DIQKLVGKLNWASQIYAGIKVKQLCKLL^

TYQIYQEPFKNLKTGKYAKKRSAHTNDVKQLTEVVQKVATESIVI

PPLVKLWYQLEKEPIAGAETFYVDGAANRETKLGKAGYVTDRGRQKVVSLTETT^

ALGIIQAQPDRSESELVNQIIEKLIEKEKVYLSWVPAHKGIGGNEQV^

SDFNLPPIVAKEIVASCDKCQLKGEAMHGQVDCSPGIWQLDCTHLEGKVILVAVHVASGYIEAEVIPAETGQETAYFLLKLA

GRWPVKVVHTDNGSNFTSAAVKAACWWANIQQEFGI^

KGGIGGYSAGERIIDIIATDIQTKELQKQITKIQNFRVYYRDSRDPIWKGPAKLLWKGEGAVVIQDNSDIKVVPRRKAKIIR

DYGKQMAGDDCVAGRQDED$

2003_A1.anc pol.OPT

TTCTTCCGCGAGAACCTGGCCTTCCAGCAGGGCGAGGCCCGCAAGTTCTCCTCCGAGCAGACCCGCGCCAACTCCCCCACCT 
CCCGCGAGCTGTGGGACGGCGGCCGCGACTCCCTGCTGTCCGAGGCCGGCGCCGAGCGCCAGGGCACCGTGCCCTCCTTCTC 
CTTCCCCCAGATCACCCTGTGGCAGCGCCCCCTGGTGACCGTGAAGATCGGCGGCCAGCTGAAGGAGGCCCTGCTGGACACC 
GGCGCCGACGACACCGTGCTGGAGGACATCAACCTGCCCGGCAAGTGGAAGCCCAAGATGATCGGCGGCATCGGCGGCTTCA 
TCAAGGTGCGCCAGTACGACCAGATCCTGATCGAGATCTGCGGCAAGAAGGCCATCGGCACCGTGCTGGTGGGCCCCACCCC 
CGTGAACATCATCGGCCGCAACATGCTGACCCAGATCGGCTGCACCCTGAACTTCCCCATCTCC

GTGAAGCTGAAGCCCGGCATGGACGGCCCCAAGGTGAAGCAGTGGCCCCTGACCGAGGAGAAGATCAAGGCCCTGACCGAGA 
TCTGCACCGAGATGGAGAAGGAGGGCAAGATCTCCAAGATCGGCCCCGAGAACCCCTACAACACCCCCGTGTTCGCCATCAA 
GAAGAAGGACTCCACCAAGTGGCGCAAGCTGGTGGACTTCCGCGAGCTGAACAAGCGCACCCAGGACTTCTGGGAGGTGCAG 
CTGGGCATCCCCCACCCCGCCGGCCTGAAGAAGAAGAAGTCCGTGACCGTGCTGGACGTGGGCGACGCCTACTTCTCCGTGC 
CCCTGGACGAGTCCTTCCGCAAGTACACCGCCTTCACCATCCCCTCCATCAACAACGAGACCCCCGGCATCCGCTACCAGTA 
CAACGTGCTGCCCCAGGGCTGGAAGGGCTCCCCCGCCATCTTCCAGTCCTCCATGACCAAGATCCTGGAGCCCTTCCGCTCC 
AAGAACCCCGAGATCGTGATCTACGAGTACATGGACGACCTGTACGTGGGCTCCGACCTGGAGATCGGCCAGCACCGCGCCA 
AGATCGAGGAGCTGCGCGCCCACCTGCTGTCCTGGGGCTTCACCACCCCCGACAAGAAGCACCAGAAGGAGCCCCCCTTCCT 
GTGGATGGGCTACGAGCTGCACCCCGACAAGTGGACCGTGCAGCCCATCAAGCTGCCCGAGAAGGACTCCTGGACCGTGAAC 
GACATCCAGAAGCTGGTGGGCAAGCTGAACTGGGCCTCCCAGATCTACGCCGGCATCAAGGTGAAGCAGCTGTGCAAGCTGC 
TGCGCGGCGCCAAGGCCCTGACCGACATCGTGACCCTGACCGAGGAGGCCGAGCTGGAGCTGGCCGAGAACCGCGAGATCCT 
GAAGGACCCCGTGCACGGCGTGTACTACGACCCCTCCAAGGACCTGGTGGCCGAGATCCAGAAGCAGGGCCAGGACCAGTGG
ACCTACCAGATCTACCAGGAGCCCTTCAAGAACCTGAAGACCGGCAAGTACGCCAAGAAGCGCTCCGCCCACACCAACGACG 
TGAAGCAGCTGACCGAGGTGGTGCAGAAGGTGGCCACCGAGTCCATCGTGATCTGGGGGAAGACCCC 

CATCCAGAAGGAGACCTGGGAGACCTGGTGGATGGAGTACTGGCAGGCCACCTGGATCCCCGAGTGGGAGTTCGTGAACACC
CCCCCCCTGGTGAAGCTGTGGTACCAGCTGGAGAAGGAGCCCATCGCCGGCGCCGAGACCTTCTACGTGGACGGCGCCGCCA
ACCGCGAGACCAAGCTGGGCAAGGCCGGCTACGTGACCGACCGCGGCCGCCAGAAGGTGGTGTCCCTGACCGAGACCACCAA
CCAGAAGACCGAGCTGCACGCCATCCACCTGGCCCTGGAGGACTCCGGCTCCGAGGTG^ 

GCCCTGGGCATCATCCAGGCCCAGCCCGACCGCTCCGAGTCCGAGCTGGTGAACCAGATCATCGAGAAGCTGATCGAGAAGG 

AGAAGGTGTACCTGTCCTGGGTGCCCGCCCACAAGGGCATCGGCGGCAACGAGCAGGTGGACAAGCTGGTG 

CCGCAAGGTGCTGTTCCTGGACGGCATCGACAAGGCCCAGGAGGAGCACGAGAAGTACCACTCCAACTGGCGCGCCATGGCC 

TCCGACTTCAACCTGCCCCCCATCGTGGCCAAGGAGATCGTGGCCTCCTGCGA

ACGGCCAGGTGGACTGCTCCCCCGGCATCTGGCAGCTGGACTGCACCCACCTGGAGG 

CGTGGCCTCCGGCTACATCGAGGCCGAGGTGATCCCCGCCGAGACCGGCCAGGAGACCGCCTACTTCCTGCTGAAGCTGGCC 
GGCCGCTGGCCCGTGAAGGTGGTGCACACCGACAACGGCTCCAACTTCACCTCCGCCG 

CCAACATCCAGCAGGAGTTCGGCATCCCCTACAACCCCCAGTCCCAGGGCGTGGTGGAGTCCATGAACAAGGAGCTGAAGAA
GATCATCGGCCAGGTGCGCGAGCAGGCCGAGCACCTC 

AAGGGCGGCATCGGCGGCTACTCCGCCGGCGAGCGCATCATCGACATCATCGCCACCGACATCGAGACGAAGGAGCTGCAGA 
AGCAGATCACCAAGATCCAGAACTTCCGCGTGTACTACCGCGACTCCCGCGACCCCATCTGGAAGGGCCCCGCGAAGCTGCT 
GTGGAAGGGCGAGGGCGCCGTGGTGATCCAGGACAACTCCGACATCAAGGTGGTGCCCCGCCGCAAGGCCAAGATCATCCGC 
GACTACGGCAAGCAGATGGCCGGCGACGACTGCGTGGCCGGCCGCCAGGACGAGGACTAA 

65. 2003_CON_A2 pol.PEP

FFRKNLAFQQREARKFSSEQNRANSPTSRELRNGGRDNLLSEAGAEEQGTVH

GADDTVLEDINLPGKWKPKMIGGIGGFIKVRQYDQIAIEI^

VKLKPGMDGPKVKQWPLTEEKIKALTEICKEMEKEGKISKIGPENP^

LGIPHPAGLKKKKSVTVLDVGDAYFSVPLHEDFRKYTAFTIPSINNETPGIRYQY^

KNPEMVIYQYMDDLYVGSDLEIGQHRAKIEE

DIQKLVGKLNWASQIYAGIKVKQLCKLLRGTKALTDIVTLTKEAELELEENREILKNPVHGVYYDPSKDLIAEIQKQGQDQW

TYQIYQEPFKNLKTGKYAKRKSTHTNDVKQLTEAVQKIAIESIVIWGKTPKFRLPIQKET^

PPLVKLWYQLETEPIAGAETFYVDGAANRETKLGKAGYVTD

ALGIIQAQPDRSESELVNQIIEKLIEKERVYLSWVPAHKGIGGNEQVDK^

HDFNLPPIVAKEIVASCDKCQLKGEAMHGQVDCSPGIWQLDCTHLEGKVILVAVHVASGYIEAEVIPAETGQETAYFILKLA
GRWPVKVIHTDNGPNFTSATVKAACWWAGVQQEFGIPYNPQSQGVVESMNKELKKIIGQVRDQAEHLKTAVQMAVFIHNFKR
KGGIGGYSAGERIIDIIATDIQTKELQKQIIKIQNFRVYYRDSRDPIWKGPAKLLWKGEGAVVIQDNSDIKVVPRRKAKIIR
DYGKQMAGDDCVAGRQDED$

2003_CON_A2 pol.OPT

TTCTTCCGCGAGAACCTGGCCTTCCAGCAGCGCGAGGCCCGCAAGTTCTCCTCCGAGCAGAACCGCGCCAACTCCCCCACCT 

CCCGCGAGCTGCGCAACGGCGGCCGCGAGAACCTGCTGTCCGAGGCCGGCGCCGAGGAGCAGGGCACCGTGC^ 

CTTCCCCCAGATCACCCTGTGGCAGCGCCCCCTGGTGACCGTGAAGATCGAGGGCCAGCTGCGCGAGGCCCTGCTGGACACC 






GGCGCCGACGACACCGTGCTGGAGGACATCAACCTGCCCGGCAAGTGGAAGCCCAAGATGATCGGCGGCATCGGCGGCTTCA 
TCAAGGTGCGCCAGTACGACCAGATCGCCATCGAGATCTGCGGCAAGCGCGCCATCGGCACCGTGCTGGTGGGCCCCACCCC 
CGTGAACATCATCGGCCGCAACATGCTGGTGCAGCTGGGCTGCACCCTGAACTTCCCCATCT

GTGAAGCTGAAGCCCGGCATGGACGGCCCCAAGGTGAAGCAGTGGCCCCTGACCGAGGAGAAGATCAAGGCCCTGACCGAGA

TCTGCAAGGAGATGGAGAAGGAGGGCAAGATCTCCAAGATCGGCCCCGAGAACC^ 

GAAGAAGGACTCC^CCAAGTGGCGCAAGCTGGTGGACTTCCGCGAGCTGAACAAGC 

CTGGGCATCCCCCACCCCGCCGGCCTGAAGAAGAAGAAGTCCGTGACCGTGCTGGACGTGGGCGACGCCTACTTCTCCGTGC
CCCTOC^CGAGGACTTCCXSCAAGTAC^CCGCCTTC^^ 

CAACGTGCTGCCCCAGGGCTGGAAGGGCTCCCCCGCCATCTTCCAGTCCTCC 

AAGAACCCCGAGATGGTGATCTACCAGTACATGGACGACCTGTACGTGGGCTCCGACCTGGAGATCGGCCAGCACCGCGCCA 

AGATCGAGGAGCTGCGCGCCCACCTGCTGCGCTGGGGCTTCACCACCCCCGACAAGAAGCACCAGAAGGAGCCCCCCTTCCT 

GTGGATGGGCTACGAGCTGCACCCCGACAAGTGGACCGTGCAGCCCATCAAGCTGCCCGAGAAGGACTCCTGGACCGTGAAC

GACATCCAGAAGCTGGTGGGCAAGCTGAACTGGGCCTCCCAGATCTACGCCGGCATCAAGGTGAAGCAGCTGTGC^ 

TGCGCGGCACCAAGGCCCTGACCGACATCGTGACCCTGACCAAGGAGGCCGAGCTGGAGCTGGAGGAGAACCGCGAGATCCT 

GAAGAACCCCGTGCACGGCGTGTACTACGACCCCTCCAAGGACCTGATCGCCGAGATCCAGAAGCAGGGCCAGGACCAGTGG 

ACCTACCAGATCTACCAGGAGCCCTTCAAGAACCTGAAGACCGGCAAGTACGCCAAGCGCAAGTCCACCCACACCAACGACG 

TGAAGCAGCTGACCGAGGCCGTGCAGAAGATCGCCATCGAGTCCATCGTGATCTGGGGCAAGACCCCCAAGTTCCGCCTG^ 

CATCCAGAAGGAGACCTGGGAGACCTGGTGGACCGAGTACTGGCAGGCC^ 

CCCCCCCTGGTGAAGCTGTGGTACCAGCTGGAGACCGAGCCCATCGCCGGCGCCGAGACCTTCTACGTGGACGGCGCCGCCA 

ACCGCGAGACCAAGCTGGGCAAGGCCGGCTACGTGACCGACCGCGGCCGCCAGAAGATCGTGTCCCTGACCGAGACC^CCAA 

CCAGAAGACCGAGCTGCACGCCATCTACCTGGCCCTGCAGGACTCCGGCCTGGAGGTGAACATCGTGACCGACTCCCAGTAC 

GCCCTGGGCATCATCCAGGCCCAGCCCGACCGCTCCGAGTCCGAGCTGGTGAACCAGATCATCGAGAAGCTGATCGAGAAGG 

AGCGCGTGTACCTGTCCTGGGTGCCCGCCCACAAGGGCATCGGCGGCAACGAGCAGGTGGACAAGCTGGTGTCCTCCGGCAT 

CCGCAAGGTGCTGTTCCTGGACGGCATCGACAAGGCCCAGGAGGAGCACGAGCGCTACCACTCCAACTGGCGCGCCATGGCC 

CACGACTTCAACCTGCCCCCCATCGTGGCCAAGGAGATCGTGGCCTCCTGCGACAAGTGCCAGCTGAAGGGCGAGGCCATGC 

ACGGCCAGGTGGACTGCTCCCCCGGCATCTGGCAGCTGGACTGCACCCACCTGGAGGGCAAGGTGATCCTGGTGGCCGTGCA 

CGTGGCCTCCGGCTACATCGAGGCCGAGGTGATCCCCGCCGAGACCGGCCAGGAGACCGCCTACTTCATCCTGAAGCTGGCC 

GGCCGCTGGCCCGTGAAGGTGATCCACACCGACAACGGCCCCAACTT 

CCGGCGTGCAGCAGGAGTTCGGCATCCCCTACAACCCCCAGTCC 

GATCATCGGCCAGGTGCGCGACCAGGCCGAGO&CCTGAAGACC^ 

AAGGGCGGGATCGGCGGCTACTCCGCCGGCGAGCGGATCATCGAGATCATCGCCACC^ 

AGCAGATCATCAAGATCCAGAACTTCCGCGTGTACTACCGCGACTCCCGCGACCCCATCTGGAAGGGCCCCGCCAAGCTGCT 
GTGGAAGGGCGAGGGCGCCGTGGTGATCCAGGACAACTCCGACATCAAGGTGGTGCCCCGCCGCAAGGCCAAGATCATCCGC 
GACTACGGCAAGCAGATGGCCGGCGACGACTGCGTGGCCGGCCGCCAGGACGAGGACTAA 

66. 2003_CON_B pol.PEP

FFREDLAFPQGKAREFSSEQTRANSPTRRELQWGRDNNSLSEAGADR^ 
GADDTVLEEMNI^PGRWKPKMIGGIGGFIKVRQYDQIIjIEICGHKAIGTVIjVGPTP 

VKLKPGMDGPKVKQWPLiTEEKI KALVE ICTEMEKEGKI SKIGPENP YNTPVFAI KKKDSTKWRKLVDFREIjNKRTQDFWEVQ 

LGIPHPAGLKKKKSVTVIjDVGDAYFSVPLiDKDFRKYTAFTIPSINNET 

QNPDIVIYQYMDDLYVGSDLEIGQHRTKIEELRQHI^RWGFTTPDKKH^ 

DIQKLVGKLNWASQ IYAGI KVKQLCKLLRGTKAL.TEVI PLTBEAELELAENRE I LKEPVHGVYYDPSKDL. I AE IQKQGQGQW 

TYQIYQEPFKNLKTGKYARMRGAHTNDVKQLTEAVQKIATESIVIWG 

PPLVKLWYQLEKEPIVGAETFYVIX3AANRETKLGKAGY 

AliGI IQAQPDKSESELVSQI IEQLIKKEK\TYIjAWVPAHKGIGGNEQVDKLVSAGIRKVLFL^ 

SDFNLPPVVAKEIVASCDKCQLKGEAMHGQVDCSPGIWQIjDCTHIjEGKI ILVAVHVASGYIEAEVI PAETGQETAYFLLiKXiA 
GRW P VKT I HTDNG SN FTS TTVKAAC WWAG I KQE FG I P YNPQS QG WE S MNKE LKKI I GQVRD Q AEHLKTAVQMAVF I HNFKR 
KGGIGGYSAGERIVDI XATDIQTKELQKQITKIQNFRVYYTUDSRDPIiWKGPAKIjLiWKGEGAVVIQDNSDI kwprrkaki ir 
D YGKQMAGDDCVAS RQD ED $ 

2003_CON_B pol.OPT 

TTCTTCCGCGAGGACCTGGCCTTCCCCCAGGGGAAGGCCCGCGAGTTCTC 

GCCGCGAGCTGCAGGTGTGGGGCCGCGACAACAACTCCCTGTCCGAGGCCGGCGCCGACCGCCAGGGCACCGTGTCCTTCTC 
CTTCCCCCAGATCACCCTGTGGCAGCGCCCCCTGGTGACCATCAAGATCGGCGGCCAGCTGAAGGAGGCCCTGCTGGACACC 
GGCGCCGACGACACCGTGCTGGAGGAGATGAACCTGCCCGGCCGCTGGAAGCCCAAGATGATCGGCGGCATCGGCGGCTTCA
TCAAGGTGCGCCAGTACGACCAGATCCTGATCGAGATCTGCGGCCACAAGGCCATCGGCACCGTGCTGGTGGGCCCCACCCC 
CGTGAACATCATCX^CCGCAACCTGCTGACCCAGATCGGCTGCACCCTGAACTTCCCCATCTCCCCCATCGAGACCGTGCCC 
GTGAAGCTGAAGCCCGGCATGGACGGCCCCAAGGTGAAGCAGTGGCCCCTGACCGAGGAGAAGATCAAGGCCCTGGTGGAGA 
TCTGCACCGAGATGGAGAAGGAGGGCAAGATCTCCAAGATCGGCCCCGAGAACCCCTACAACAC 
GAAGAAGGACTCCACCAAGTGGCGCAAGCTGGTGGACTT

CTGGGCATCCCCCACCCCGCCGGCCTGAAGAAGAAGAAGTCCGTGACCGTGCTGGACGTGGGCGACGCCTACTTCTCCGTGC
CCCTGGACAAGGACTTCCGC^GTACACCGCCTTCACCATC 

CAACGTGCTGCCCCAGGGCTGGAAGGGCTCCCCCGCCATCTTCCAGTCCTCCyVTGACCAAGATCCTGGAGCCCTTCCGCAAG 
CAGAACCCCGACATCGTGATCTACCAGTACATGGACGACCTGTACGTGGGCTCCGACCTGGAGATCGGCCAGCACCGCACCA 
AGATCGAGGAGGTGCGCCAGGACCTGCTGCGCTGGGGCTTCACGACCCC 

GTGGATGGGCTACGAGCTGCACCCCGACAAGTGGACCGTGCAGCCCATCGTGCTGCCCGAGAAGGACTCCTGGACCGTGAAC 

GACATCCAGAAGCTGGTGGGCAAGCTGAACTGGGCC^CCCAGATCTACGCCGGC^TCAAGGTGAAGCAGCTGTG 

TGCGCGGCACCAAGGCCCTGACCGAGGTGATCCCCCTGACCGAGGAGGCCGAGCTGGAGCTGGCCGAGAACCGCGAGATCCT

GAAGGAGCCCGTGCACGGCGTGTACTACGACCCCTCCAAGGACCTGATCGCCGAGATCCAGAAGCAGGGCCAGGGCCAGTGG 

ACCTACCAGATCTACCAGGAGCCCTTCAAGAACCTGAAGACCGGCAAGTACGCCCGCATGCGCGGCGCCCACACCAACGACG 

TGAAGCAGCTGACCGAGGCCGTGCAGAAGATCGCCACCGAGTCCATCGTGATCTGGGGCAAGACCCCCAAGTTCAAGCTGCC 

C^TCCAGAAGGAGACCTGGGAGGCCTGGTGGACCGAGTACTGGCAGGCCACCTGGATCCCCGAGTGGGAGTTCGTGAACACC 

CCCCCCCTGGTGAAGCTGTGGTACCAGCTGGAGAAGGAGCCCATCGTGGGCGCCGAGACCTTCTACGTGGACGGCGCCGCCA 

ACCGCGAGACCAAGCTGGGCAAGGCCGGCTACGTGACCGACCGCGGCCGCCAGAAGGTGGTGTCCCTGACCGACACCACCAA 

CCAGAAGACCGAGCTGCAGGCCATCCACCTGGCCCTGCAGGACTCCGGCC 

GCCCTGGGCATCATCCAGGCCCAGCCCGACAAGTCCGAGTCCGAGCTGGTGTCCCAGATCATCGAGCAGCTGATCAAGAAG^ 

AGAAGGTGTACCTGGCCTGGGTGCCCGCCC^CAAGGGCATCGGCGGCAACGAGCAGGTGG^ 

CCGCAAGGTGCTGTTCCTGGACGGCATCGACAAGGCCCAGGAGGAGCACGAGAAGTACCACTCCAAC^ 

TCCGACTTCAACCTGCCCCCCGTGGTGGCCAAGGAGATCGTGGCCTCCTGCGACAAGTGCCAGCTGAAGGGCGAGGCCATGC 
ACGGCCAGGTGGACTGCTCCCCCGGCATCTGGCAGCTGGACTC 

CGTGGCCTTCCGGCTACATCGAGGCCX^GGTGATCCCCGCC^AGACCGGCCAGGAGACCGCCTACTTCCTGCTGAAGCTGGCC 
GGCCGCTGGCCCGTGAAGACCATCCACACCGACAACGGCTCCAACTTCACCTCCACCACCGTGAAGGCCGCCTGCTGGTGGG
CCGGCATCAAGCAGGAGTTCGGCATCCCCTACAACCCCC^.GTCCCAGGGCGTGGTGGAGTCCATGAACAAGGAGCTGAAGAA 
GATCATCGGCCAGGTGCGCGACCAGGCCGAGCACCTGAAGACCGCCGTGCAGATGGCCGTGTTCATCCACAACTTCAAGCGC 
AAGGGCGGCATCGGCGGCTACTCCGCCGGCGAGCGCATCGTGGACATCATCGCCACCGACATCCAGACCAAGGAGCTGCAGA 

AGCAGATCACCAAGATCCAGAACTTCC^CGTGTACTACCGCGACTCCC 

GTGGAAGGGCGAGGGCGCCGTGGTGATCCAGGACAACTCCGACATCAAGGTGGTGCCCCGCCGCAAGGCCAAGATCATCCGC 
GACTACGGCAAGCAGATGGCCGGCGACGACTGCGTGGCCTCCCGCCAGGACGAGGACTAA 

67. 2003_B.anc pol.PEP 

FFRENIAFPQGKAREFSSEQTRANSPTRRELQVWGRDNNPI^ 
GADDTVLEEMNLPGKWKPKMIGGIGGFIKVRQYDQILIEICGHKAI 
VKLKPGMDGPKVKQWPI/TEEKIKALVEICTEMEKEGK^^ 
IXSIPHPAGLKKKKSVTVLDVGDAYFSVPLDKDFRK^ 
QOTEIVIYQYMDDLYVGSDLEIGQHRTKIEEL^^ 
DIQKXjVGKLNWASQIYAGIK^/TCQIjCKXiLRGTKAIjTEVVPIjTEEAE 
TYQIYQEPFKNLKTTGKYARMRGAHTNBVKQLTEAVQKIATESIVIWGKTPKFKLPI 
PPLVKLWQI^KEPIVGAETFYVDGAANRETKLGKAGYVTDRGRQKWSLTDTTO 
AIX3IIQAQPDKSESEIiVSQIIEQLIK3CEKVYIiAWVPAHKGIGGNEQVDKLVSAGIRKVL 
SDFNLPPWAKEIVASCDKCQLKGEAMHGQVDCSPGIWQx^^ 

GRWPVKVIHTDNGSNFTSTTVKAACWWAGIKQEFGI PYNPQSQGWESMNKEIjKKI IGQVRDQAEHLKTAVQMAVFIHNFKR 
KGG IGGYS AGERI VD 1 1 ATD I QTKELiQKQI TKI QNFRVYYRDSRDPIjWKG PAKLIjWKGEGAWI QDNSD I KWPRRKAKI I R 
DYGKQMAGDDCVASRQDED$ 



2003_B.anc pol.OPT

TTCTTCCGCGAGAACCTGGCCTTCCCCC^GGGCAAGKSCCCGCGAGTTCTCCTCCGAGCAGACCCGCGCCAACT 

GCCGCGAGCTGCAGGTGTGGGGCCGCGACAACAACCCCCTGTCCGAGGCCGGCGCCGACCGCCAGGGCACCGTGTCCTTCTC 
CTTCCCCCAGATCACCCTGTGGCAGCGCCCCCTGGTGACCAT 

GGCGCCGACGACACCGTGCTGGAGGAGATGAACCTGCCCGGCAAGTGGAAGCCCAAGATGATCGGCGGCATC 
TCAAGGTGCGCCAGTACGACCAGATCCTGATCGAGATCTGCGGCCACAAGGCCATC^ 

CGTGAACATCATCGGCCGCAACCTGCTGACCCAGATCGGCTGCACCCTGAACTTCCCCATCTCCCCC7VTCGAGACCGTGCCC 

GTGAAGCTGAAGCCCGGCATGGACGGCCCCAAGGTGAAGCAGTGGCCCCTGACCGAGGAGAAGATCAAGGCCCTGGTGGAGA 

TCTGCACCGAGATGGAGAAGGAGGGCAAGATCTCCAAGATCGGCCCCGAGAACCCCTACAAGACCCCCXSTGTTC^ 

GAAGAAGGACTCCACCAAGTGGCGCAAGCTGGTGGACTTCCGCGAGCTGAACAAGCGCACCCAGGACTTCTG 

CTGGGCATCCCCCACCCCGCCGGCCTGAAGAAGAAGAAGTCCGTGACCGTGCTGGACGTGGGCGACGCCTACTTCTCCGTGC 

CCCTGGACAAGGACTTCCGCAAGTACACCGCCTTCACCATCCCCTCCATCAACAACGAGACCCCCGGCATCCGCTACCAGTA 

CAACGTGCTGCCCCAGGGCTGGAAGGGCTCCCCCGCCATCTTCCAGTCCTCCATGACCAAGATCCTGGAGCCCTT 

CAGAACCCCGAGATCGTGATCTACCAGTACATGGACGACCTGTACGTGGGCTCCGACCTGGAGATCGGCCAGCACCGCACCA 






AGATCGAGGAGCTGCGCGAGCACCTGCroCGCTGGGG^ 

GTGGATGGGCTACGAGCTGCACCCCGACAAGTGGACCGTGCAGCCCATCGTGCTGCCCGAGAAGGACTCCTGGACCGTGAAC 
GACATCCAGAAGCTGGTGGGCAAGCTGAACTGGGCCTCCCAGATCTACGCCC^ 

TGCGCGGCACCAAGGCCCTGACCGAGGTGGTGCCCCTGACCGAGGAGGCCGAGCTGGAGCTGGCCGAGAACCGCGAGATCCT
GAAGGAGCCCGTGCACGGCGTGTACTACGACCCCTCCAAGGACCTGATCGCCGAGATCCAGAAGCAGGGCCAGGGCCAGTGG 
ACCTACCAGATCTACCAGGAGCCCTTCAAGAACCTGAAGACCGGCAAGTACGCCCGCATGCGCGGCGCCCACACCAACGACG 
TGAAGCAGCTGACCGAGGCCGTGCAGAAGATCGCGACCGAGTCCATCGTGATCT 

CATCCAGAAGGAGACCTGGGAGGCCTGGTGGACCGAGTACTGGCAGGCCACCroGATCCCCGAGTGGGAGTTCGTGAACACC 
CCCCCCCTGGTGAAGCTGTGGTACCAGCTGGAGAAGGAGCCCATCGTGGGCGCCGAGACCTTCTACGTGGACGGCGCCGCCA 
ACCGCGAGACCAAGCTGGGCAAGGCCGGCTACGTGACCGACCGCGGCCGCGAGAAGGTG 

CCAGAAGACCGAGCTGCAGGCCATCGACCTGGCCCTGCAGGACTCCGGCCTGGAGGTGAACATCGTGACCGACTCCCAGTAC 
GCCCTGGGCATCATCC^GGCCCAGCCCGACAAGTCCGAGTCCGAGCTGGTGTCCCAGATCATCGAGCAGCTGATCAAGAAGG 
AGAAGGTGTACCTGGCCTGGGTGCCCGCCGACAAGGGCATCGGCGGCAACGAGCAGGTGGACAAGCTGGTGTCCGCCGGCAT 
CCGCAAGGTGCTGTTCCTGGACGGCATCGAC^^GGCCCAGGAGGAGCACGAGAAGTACCACTCCAA 

TCCGACTTCAACCTGCCCCCCGTGGTGGCCJAGGAGATCGTGGCCTCCTGCGACAAGTGCCAGCTGAAGGGCGAGGCCATGC 
ACGGCCAGGTGGACTGCTCCCCCGGCATCTGGCAGCTGGACTGCACCCACCTGGAGGGCAA 

C^TGGCCTCCGGCTACATCGAGGCCGAGGTGATCCCCGCCGAGACCGGCCAGGAGACCGCCTACTTCATCCTGAAGCTGGCC 
GGCCGCTGGCCCGTGAAGGTGATCCACACCGACAACGGCTCCAACTTCACCTCCACCACCGTGAAGGCCGCCTGCTGGTGGG 
CCGGCATCAAGCAGGAGTTCGGCATCCCCTACAACCCCC^GTCCCAGGGCGTGGTGGAGTCCATGAACAAGGAGCTGAAGAA 
GATCATCGGCCAGGTGCGCGACCAGGCCGAGCACCTGAAGACCGCCGTGCAGATGGCCGTGTTCATCCACAACTTCAAGCGC
AAGGGCGGCATCGGCGGCTACTCCGCCGGCGAGCGCATCGTGGACATCATCGCCACCGACATCCAGACCAAGGAGCTGCAGA 
AGGAGATCACCAAGATCCAGAACTTCCGCGTGTACTACCGCGACTCCCGCGACCCCCTGTGGAAGGGCCCCGCCAAGCTGCT 
GTGGAAGGGCGAGGGCGCCGTGGTGATCCAGGACAACTCCGACATCAAGGTGGTGCCCCGCCGCAAGGCCAAGATCATCCGC 
GACTACGGCAAGCAGATGGCCGGCGACGACTGCGTGGCCTCCCGCCAGGACGAGGACTAA

68. 2003_CON_C pol.PEP 

FFRENLAFPQGEAREFPSEQTRANSPTSRELQVRGDNPRSEAGAERQGTIiN^ 
T\nUEEINLPGKWKPKMIGGIGGFIKVRQYDQILIEI 

PGMDGPKVKQWPLTEEKIKALTAICEEMEKEGKITKIGPENPYNTPVFAIKKKDSTKWRKL 

HPAGIiKKKKSVTVLDVGDAYFSVPIiDEGFRKYTAFTI PS INNETPGIRYQYNVLPQGWKGSPAIFQSSMTKIIiEPFRAQNPE 
IVIYQYMDDLYVGSDLEIGQHRAKIEELREHLLKW^^ 

LiVGKLiNW ASQ I Y PG I KVRQLCKXiLRG AKALTDI V^PLTERAELELiAENRE I LKE PVHGVYYDPS KDLi I AE I QKQGHDQWTYQ I 
YQEPFKNLKTGKYAKMRTAHTNDVKQIjTEAVQKIAMES IVIWGKTPKFRLiPIQKETWETWWTDYWQATWI PEWEFVNTPPLV 
KLWYQI,EKEPIAGAETFYVDGAANRETKIGKAGYVTDRGRQKIVSLTETO 

IQAQPDKSESEIjVNQIIEQLIKKERVYIjSWVPAHKGIGGNEQVDKLVSSGIRKVLFIjDGIDKAQEEHEKYH 
LPPIVAKEIVASCDKCQLKGEIAIHGQVDCSPGIWQIjDCTHIjEGKIIIjVAVHVASGYIEIAEVIPAETGQETAYY 

VKVIHTDNGSNFTSAAVKAACWWAGIQQEFGIPYNPQSQ

GGYS AGERI ID 1 I ATDIQTKELQKQI I KIQNFRVYYRDSRDP I WKGPAKLIiWKGEGAWI QDNSD I KWPRRKAKI IKDYGK 
QMAGADCVAGRQDED $ 

2003_CON_C pol.OPT

TTCTTCCGCGAGAACCTGGCCTTCCCCCAGGGCGAGGCCCGCGAGTTCCC 

CCCGCX^GCTGCAGGTGCGCGGCGACAACCCCCGCTCCGAGGCCGGCGCCGAGCGCCAGGGCACCCTGAACTTCCCCCAGAT 
CACCCTOTGGCAGCGCCCCCTGGTGTCCATCAAGGTGGGCGGCCAGATCAAGGAGGCCCTGCTGGACACCGGCGCCGACGAC 
ACCGTGCTGGAGGAGATCAACCTGCCCGGCAAGTGGAAGCCCAAGATGATCGGCGGCAT 

AGTACGACCAGATCCTGATCGAGATCTGCGGCAAGAAGGCCATCGGCACCGTGCTGGTGGGCCCGACCCCCGTGAAC^TCAT 
CGGCCGCAACATGCTGACCCAGCTGGGCTGCACCCTGAACTTCCCCATCTCCCCCATCGAGACCGTGCCCGTGAAGCTGAAG 
CCCGGC^TGGACGGCCCCAAGGTGAAGCAGTGGCCCCTGACCGAGGAGAAGATCAAGGCCCTGACCGCCATCTGCGAGGAGA 
TGGAGAAGGAGGGCAAGATCACCAAGATCGGCCCCGAGAACCCCTACAACACCCCCGTGTTCGCCATCAAGAAGAAGGACT 
CACCAAGTGGCGCAAGCTGGTGGACTTCCGCGAGCTGAACAAGCGCACCCAGGACTTCTGGGAGGTGCAGCT^ 
CACCCCGCCGGCCTGAAGAAGAAGAAGTCCGTGACCGTGCTGGACGTGGGCGACGCCTACTTCTCCGTGCCCCTGGACGAGG 
GCTTCCGCAAGTACACCGCCTTCACCATCCCCTCCATCAACAACGAGACCCCCGGCATCCGCTACCAGTACAACGTGCTGCC 
CCAGGGCTGGAAGGGCTCCCCCGCCATCTTCCAGTCCTCCATGACCAAGATCCTGGAGCCCTTCCGCGCCCAGAACCCCGAG 
ATCGTGATCTACCAGTACATGGACGACCTGTACGTGGGCTCCGACCTGGAGATCGGCCAGCACCGCGCCAAGATCGAGGAGC 
TGCGCGAGCACCTGCTGAAGTGGGGCTTCACCACCCCCGACAAGAAGCACCAGAAGGAGCCCCCCTTCCTGTGGATGGGCT 
CGAGCTGGACCCCGAC7VAGTGGACCGTGCAGCCCATCCAGCTGCCCGAGAAGGACTCCTGGACCGTGAACGACATCCAGAAG 
CTGGTGGGCAAGCTGAACTGGGCCTCCCAGATCTACCCCGGCATCAAGGTGCGCCAGCTGTGCAAGCTGCTGCGCGGCGCCA 
AGGCCCTGACCGACATCGTGCCCCTGACCGAGGAGGCCGAGCTGGAGCTGGCCGAGAACCGCGAGATCCTGAAGGAGCCCGT 
GCACGGCGTGTACTACGACCCCTCCAAGGACCTGATCGCCGAGATCCAGAAGCAGGGCCACGACCAGTGOACCTACCAGATC 
TACCAGGAGCCCTTCAAGAACCTGAAGACCGG
CCGAGGCCGTGCAGAAGATCGCC^TGGAGTCCATCGTGATe^^ 

GACCTGGGAGACCTGGTGGACCGACTACTGGCAGGCCACCTGGATCCCCGAGTGGGAGTTCGTGAACACCCCCCCCCTGGTG 
AAGCTGTGGTACCAGCTGGAGAAGGAGCCCATCGCCGGCGCCGAGACCTTCTACX3TGGA 

AGATCGGCAAGGCCGGCTACGTGACCGACCGCGGCCGCCAGAAGATCGTGTCCCTGACCGAGACCACCAACCAGAAGACCGA 

GCTGCAGGCCATCCAGCTGGCCCTGCAGGACTCCGGCTCCGAGGTGAACATCGTGACCGAC^CCCAGTACGCCCTGGGC 

ATCCAGGCCCAGCCCGACAAGTCCGAGTCCGAGCTGGTGAACCAGATCATCGAGCAGCTGATCAAGAAGGAGCGCGTGTACC 

TGTCCTGGGTGCCCGCCCACAAGGGCATCGGCGGCAACGAGC^GGTGGACAAGCTGGTGTCCTCCG 

GTTCCTGGACGGCATCGACAAGGCCCAGGAGGAGCACGAGAAGTACCACTCCAACTGGCGCGCC^ 

CTGCCCCCCATCGTGGCCAAGGAGATCGTGGCCTCCTGCGACAAGTGCCAGCTGAAGGGCGAGGCCATCCACGGCCA 

ACTGCTCCCCCGGGATCTGGCAGCTGGACTGCACCCACCTG 

CTACATCGAGGCCGAGGTGATCCCCGCCGAGACCGGCCAGGAGACCGCCTACTACATCCTGAAGCTGGCCGGCCGCTGGCCC 

GTGAAGGTGATCCACACCGACAACGGCTCCAACTTCACCTCCGCCGCCGTGAAGGCCGCCTGCTGGTGGGCCGGCATCCAGC 

AGGAGTTCGGCATCCCCTACAACCCCCAGTCCCAGGGCGTGGTGGAGTCCATGAACAAGGAGCTGAAGAAGATCATCGGCCA

GGTGCGCGACCAGGCCGAGCACCTGAAGACCGCCGTGCAGATGGCCGTGTTCATCCACAACTTCAAGCGCAAGGGCGGCATC 

GGCGGCTACTCCGCCGGCGAGCGCATCATCGACATCATCGCCACCGACATCCAGACCAAGGAGCTGCAGAAGCAGATCATCA 

AGATCCAGAACTTCCGCGTGTACTACCGCGACTCCCGCGACCCCATCTGGAAGGGCCCCGCCAAGCTGCTGTGGAAG 

GGGCGCCGTGGTGATCCAGGACAACTCCGACATCAAGGTGGTGCCCCGCCGCAAGGCG^ 

CAGATGGCCGGCGCCGACTGCGTGGCCGGCCGCCAGGACGAGGACTAA 

69. 2003_C.anc pol.PEP

FFRENI^FPQGEAREFPSEQTIU\NSPTSREIjQVGRDNPRSEAGAERQGT 

DDTVIjEE INLiPGKWKPKM I GG IGGF I KVRQ YDQ I L I E I CGKKAI GTVIiVGPT PVN I IGRNMIiTQIiGCTIjNF P I S P I ETVP VK 
LKPGMDGPKVKQWPLTEEKIKALTAICEEMEKEGKITKIGPEN^ 

I PHPAGLKKKKS VTVLDVGDAYFS VPLDEGFRKYTAFT I PS INNETPGIRY QYNVLPQGWKGSPAIFQSSMTKILEPFRAQN 

PEIVIYQYMDDIjyVGSDIjEIGQHRAKIEEljREHLIiKWGFTTPDKKHQKEPPFLWMGYEIiHPDKWTVQPIQL 

QKLVGKLNWASQIYPGIKVRQLCKIiLRGAKAIjTDIVPLTEEAELELAENREIL 

QIYQEPFKNLKTGKYAKMRTAHTNDVKQIiTRAVQKIAMESIVIWGK^ 

LVKLWYQLEKEPXAGAETFYVDGAANRETKIGKAGYVTDRG 

GIIQAQPDKSESELVKQIIEQIjIKKEKVYTjSWVPAHKGIGGNEQVDKLVSSGIRKVIjFIjDGIDKAQEEH 

FNLPPIVAICEIVASCDKCQIjKGEAMHGQVDCSPGIWQLDCTHIjEGKI ILVAVHVASGYIEAEVI PAETGQETAYFILKLiAGR 

WPVKVIHTDNGSNFTSAAVKAACWWAGIQQEFGIPYNPQSQGVVESMNKELKKIIG
GIGGYSAGERIIDIIATDIQTKELQKQIIKIQNFRVYYRDSR
GKQMAGADCVAGRQDED$

2003_C.anc pol.OPT

TTCTTCCGCGAGAACCTGGCCTTCCCCCAGGGCGAGGCCCGCGAGTTCCCCTCCGAGCAGACCCGCGCCAACTCCCCCACCT 
CCCGCGAGCTGCAGGTGGGCCGCGACAACCCCCGCTCCGAGGCCGGCGCCGAGCGCCAGGGCACCCTGACCCTGAACTTCCC 
CCAGATCACCCTGTGGCAGCGCCCCCTGGTGTCCATCAAGGTGGGCGGCCAGATCAAGGAGGCC^ 

GACGACACCGTGCTGGAGGAGATCAACCTGCCCGGCAAGTGGAAGCCCAAGATGATCGGCGGCATCGGCGGCTTCATCAAGG
TGCGCCAGTACGACCAGATCCTGATCGAGATCTGCGGCAAGAAGGCCATCGG^ 

CATCATCGGCCGCAACATGCTGACCCAGCTGGGCTGCACCCTGAACTTCCCCATCTCCCCCATCGAGACCGTGCCCGTGAAG 
CTGAAGCCCGGCATGGACGGCCCCAAGGTGAAGCAGTGGCCCCTGACCGAGGAGAA 
AGGAGATGGAGAAGGAGGGCAAGATCACCAAGATCGGCCCCGAGAACCCCTACAACACCCCCGTC 
GGACTCCACCAAGTGGCGCAAGCTGGTGGACTTCCGCGAGCTGAACAAGCGCACC 

ATCCCCCACCCCGCCGGCCTGAAGAAGAAGAAGTCCGTGACCGTGCTGGACGTGGGCGACGCCTACTTCTCCGTGCCCCTGG 

ACGAGGGCTTCCGC^^GTACACCGCCTTCACC^TCCCCTCCAT 

GCTGCCCCAGGGCTGGAAGGGCTCCCCCGCCATCTTCOIGTCCTC^ 

CCCGAGATCGTGATCTACCAGTACATGGACGACCTGTACGTGGGCTCCGACCTGGAGATCGGCCAGCACCGCGCCAAGATCG 
AGGAGCTGCGCGAGCACCTGCTGAAGTGGGGCTTCACCACCCCCGACAAGAAGCACCAGAAGGAGCCCCCCTTCCTGTGGAT 
GGGCTACGAGCTGCACCCCGACAAGTGGACCGTGCAGCCCATCCAGCTGCCCGAGAAGGACTCCTGGACCGTGAACGACATC
CAGAAGCTGGTGGGCAAGCTGAACTGGGCCTCCCAGATCTACCCCGGCATCAAGGTGCGCCAGCTGT 

GCGCCAAGGCCCTGACCGACATCGTGCCCCTGACCGAGGAGGCCGAGCTGGAGCTGGCCGAGAACCGCGAGATCCTGAAGGA 
GCCCGTGCACGGCGTGTACTACGACCCCTCCAAGGACCTGATCGCCGAGATCCAGAAGCAGGGCCACGACCAGTGGACCTAC 
CAGATCTACCAGGAGCCCTTCAAGAACCTGAAGACCGGCAAGTACGCCAAGATGCGCACCGCCCACACCAACGACGTGAAGC
AGCTGACCGAGGCCGTGCAGAAGATCGCCATGGAGTCCATCGTGATCTGGGGCAAGACCCCCAAGTTCCGCCTGCCCATCCA 
GAAGGAGACCTGGGAGACCTGGTGGACCGACTACTGGCAGGCCACCrGGATCCCCGAGTGGGAGTTCGTGAACACCCCCCCC 
CTGGTGAAGCTGTGGTACCAGCTGGAGAAGGAGCCCATCGCCGGCGCCGAGACCTTCTACGTGGACGGCGCCGCCAACCGCG 
AGACCAAGATCGGCAAGGCCGGCTACGTGACCGACCGCGGCCGCCAGAAGATCGTGTCCCTGACCGAGACCACCAACCAGAA 
GACCGAGCTGCAGGCCATCCAGCTGGCCCTGCAGGACTCCGGCTCCGAGGTGAACATCGTGACCGACTCCCAGTACGCCCTG

GGCATCATCCAGGCCCAGCCCGACAAGTCCGAGTCCGAGCTGGTGAACCAGATCATCGAGCAGCTGATCAAGAAGGAGAAGG 

TGTACCTGTCCTGGGTGCCCGCCCACAAGGGCATCGGCGGCAACGAGCAGGTGGACAAGCTGGTGTCCTCCGGCATCCGCAA 

GGTGCTGTTCCTGGACGGCATCGACAAGGCCCAGGAGGAGCACGAGAAGTACCACTCCAACTGGCGCGCCATGGCCTCCGAG

TTCAACCTGCCCCCCATCGTGGCCAAGGAGATCGTGGCCTCCTGCGACAAGTGCCAGCTGAAGGGCGAGGCCATGCACGGCC 

AGGTGGACTGCTCCCCCGGCATCTGGCAGCTGGACTGCACCCACCTGGAGGGCAAGATCATCCTGGTGGCCGTGCACGTGGC 

CTCCGGCTACATCGAGGCCGAGGTGATCCCCGCCGAGACCGGCCAGGAGACCGCCTACTTC^TCCTGAAGCTGGCCGGCCGC 

TGGCCCGTGAAGGTGATCCACACCGACAACGGCTCCAACTTCACCTCCGCCGCCGTGAAGGCCGCCTGCTGGTGGGCCGGCA 

TCCAGCAGGAGTTCGGCATCCCCTACAACCCCCAGTCCCAGGGCGTGGTGGAGTCCATGAACAAGGAGCTGAAGAAGATCAT 

CGGCGAGGTGCGCGACCAGGCCGAGCACCTGAAGACCGCCGTGCAGATGGCCGT^ 

GGCATCGGCGGCTACTCCGCCGGCGAGCGCATCATCGACATCATCGCCACCGACATCC^ 

TCATCAAGATCCAGAACTTCCGCGTGTACTACCGCGACTCCCGCGACCCCATCTGGAAGGGCCCCGCCAAGCTGCTGTG 

GGGCGAGGGCGCCGTGGTGATCCAGGACAACTCCGACATCAAGGTGGTGCCCCGCCGCAAGGCCAAGATCATCCGCGACT 

GGCAAGCAGATGGCCGGCGCCGACTGCGTGGCCGGCCGCCAGGACGAGGACTAA 

70. 2003_CON_D pol.PEP 

FFRENIxAFPQGKAGEIiSSEQTRANSPTSRELRWGGDNPLSETGAERQGTVSFNFPQITLWQR^ 

ADDTVLEEINLPGKWKPKMIGGIGGFIKVRQYDQIL 

KLK3>GMIX3PKVKQWPIjTEEKIKAIjTEICTEMEK^ 

G I PHP AGLKKKKS VTVLDVGDAYFS VPLDEDFRKYTAFT I PS INNETPG I RYQYNVLPQGWKGSPAIFQSSMTKIIjEPFRKQ 

NPEIVIYQYMDDLYVGSDLEIGQHRTKIEELREHLI^WGF^ 

IQKLVGKJjNWASQIYPGIKVRQLCKXIiRGTKAM 

YQIYQEPFKNLKTGKYARMRGAHTTTOVKQIjTEAVQKIAIESIVIW 

PLVTOiV^QLEKEPIIGAETFYVDGAANRETKLGKAGYVTDRGRQKWPL 

LGIIQAQPDKSESELVSQIIEQLIKKEKVYIiAWVPAHKGIGG 

DFNLPPWAKEIVASCDKCQLKGEAMHGQVDCSPGIWQL^ 

RWPVKVVHTDNGSNFTSAAVKAACWWAGIKQEFGIPYNPQSQGVVESMNKEIjKKIIGQV^ 

GGIGGYSAGERIIDIIATDIQTKELQKQIIKIQNFRVYYRDSRDPI^
YGKQMAGDDCVASRQDED$

2003_CON_D pol.OPT

TTCTTCCGCGAGAACCTGGCCTTCCCCC^GGGC^^GGC^ 
CCCGCGAGCTGCGCGTGTGGGGCGGCGACAACCCCCTGTCC^^ 

CCCCCAGATCACCCTGTGGCAGCGCCCCCTGGTGACCATCAAGATCGGCGGCCAGCTGAAGGAGGCCCTGCTGGACACCGGC 
GCCGACGACACCGTGCTGGAGGAGATCAACCTGCCCGGCAAGTGGAAGCCCAAGATGATCGGCGGCATCGGCGGCTTCATCA 
AGGTGCGCCAGTACGACCAGATCCTGATCGAGATCTGCGGCCACAAGGCCATCGGCACCGTGCTGGTGGGCCCCACCCCCGT 
GAACATCATCGGCCGCAACCTGCH<3ACCCAGATCGGCTGCA 

AAGCTGAAGCCCGGCATGGACGGCCCCAAGGTGAAGCAGTGGCCCCTGACCGAGGAGAAGATCAAGGCCCTGACCGAGATCT 

GCACCGAGATGGAGAAGGAGGGCAAGATCTCCCGCATCGGCCCCGAGAACCCCTACAACACCCCCATCTTCGCCATC^ 

GAAGGACTCCACCAAGTGGCGCAAGCTGGTGGACTTCCGCGAGCTGAACAAGCGCACCCAGGACTTCTGGGAGGTGCAGCTG 

GGCATCCCCCACCCCGCCGGCCTGAAGAAGAAGAAGTCCGTGACCGTGCTGGACGTGGGCGACGCCTACTTCTCCGTGCCCC 

TGGACGAGGACTTCCGGAAGTACACCGCCTTCACCATCCCCTCCATCAACAACGAGACCCCCGGCATCCGCTACCAGTACAA 

CGTGCTGCCCCAGGGCTGGAAGGGCTCCCCCGCCATCTTCCAGTCCTCCATGACCAAGATCCTGGAGCCCTTCCGCAAGCAG 

AACCCCGAGATCGTGATCTACCAGTACATGGACGACCTGTACGTGGGCTCCGACCTGGAGATCGGCCAGCACCGCACCAAGA 

TCGAGGAGCTGCGCGAGCACCTGCTGCGCTGGGGCTTCACCACCCCCGACAAGAAGCACCAGAAGGAGCCCCCCTTCCTGTG 

GATGGGCTACGAGCTGCACCCCGACAAGTGGACCGTGCAGCCCATCAAGCTGCCCGAGAAGGAGTCCTGGACCGTGAACGAC 

ATCCAGAAGCTGGTGGGCAAGCTGAACTGGGCCTCCCAGATCTACCCCGGCATCAAGGTGCGCC^GCTGTGCAAGCTGCTGC 

GCGGCACCAAGGCCCTGACCGAGGTGATCCCCCTGACCGAGGAGGCCGAGCTGGAGCTGGCCGAGAACCGCGAGATCCTGAA 

GGAGCCCGTGCACGGCGTGTACTACGACCCCTCCAAGGACCTGATCGCCGAGATCCAGAAGCAGGGCCAGGGCCAGTGGACC 

TACCAGATCTACCAGGAGCCCTTCAAGAACCTGAAGACCGGCAAGTACGCCCGCATGCGCX5GCGCCCACACCAACGACGTGA 

AGCAGCTGACCGAGGCCGTGCUVGAAGATCGCCATCGAGTCCATCGTGATCTGGGGCAAGACCCCCAAGTTCCGCCTGCCCAT 

CCAGAAGGAGACCTGGGAGACCTGGTGGACCGAGTACTGGCAGGCCACCTGGATCCCCGAGTGGGAGTTCGTGAACA 

CCCCTGGTGAAGCTGTGGTACCAGCTGGAGAAGGAGCCCATCATCGGCGCCGAGACCTTCTACGTGGACGGCGCCGCCAACC 

GCGAGACCAAGCTGGGCAAGGCCGGCTACG1X3ACCGACCGCGGCCGCCAGAAGGTGGTGCCCCTGACCGACACCACCAACCA 

GAAGACCGAGCTGCAGGCCATCAACCTGGCCCTGCAGGACTCCGGCCTGGAGGTGAACATCGTGACCGACTCCC^ 

CTGGGCATCATCCAGGCCCAGCCCGACAAGTCCGAGTCCGAGCTGGTGTCCCAGATCATCGAGCAGCTGATCAAGAA 

AGGTGTACCTGGCCTGGGTGCCCGCCCACLAAGGGCATCGGCGGCAACGAGCAGGTGGACAAGCTGGTGTCCAACGGCATCCG 

CAAGGTGCTGTTCCTGGACGGCATCGACAAGGCCCAGGAGGAGCACGAGAAGTACCACAACAACTGGCGCGCCATGG 

GACTTCAACCTGCCCCCCGTGGTGGCCAAGGAGATCGTGGCC^CCTGCGACAAGTGCCAGCTGAAGGGCGAGGCCATGCACG 
GCCAGGTGGACTGCTCCCCCGGCATCTGGCAGCTGGACTGCACCCACCTGGAGGGCAAGGTGATCCTGGTGGCCGTGCACGT

GGCCTCCGGCTACATCGAGGCCGAGGTGATCCCCGCCGAGACCGGCCAGGAGACCGCCTACTTCCTGCTGAAGCTGGCCGGC 

CGCTGGCCCGTGAAGGTGGTGCACACCGACAACGGCTCCAACTTCACCTCCGCCGCCGTGAAGGCCGCCTGCTGGTGGGCCG 

GCATCAAGCAGGAGTTCGGCATCCCCTACAACCCCCAGTCCCAGGGCGTGGTGGAGTCCATGAACAAGGAGCTGAAGAAGAT

CATCGGCCAGGTGCGCGACCAGGCCGAGCACCTGAAGACCGCCGTGC^GATGGCCGT^ 

GGCGGGATCGGCGGCTACTCCGCCGGCGAGCGCATCATCGACATCATC^ 

AGATCATCAAGATCCAGAACTTCCGCGTGTACTACCGCGACTCCCGCGACCCCATCTGGAAGGGCCCCGCCAAGCTGCTGTG

GAAGGGCGAGGGCGCCGTGGTGATCCAGGACAACTCCGACATCAAGGTGGTGCCCCGCCGCAAGGTGAAGATCATCC 

TACGGCAAGCAGATGGCCGGCGACGACTGCGTGGCCTCCCGCCAGGACGAGGACTAA

71. 2003_CON_F1 pol.PEP

FFRENLAFQQGEARKFPSEQTRANSPASREI^VQRGDNPLSEAGAERRGTTOSI^^ 

GADDTVLEDINIiPGKWKPKMIGGIGGFIKVKQYDHILIEICGHKAIGTVLVGPTPVN 

VKLKPGMDGPKVKQWPLTEEKIKAI/TEICTEMEKEGK^ 

LGI PHPAGLKKKKSVTVLDVGDAYFSVPLDKDFRKYTAFTIPSVNNETPGIRYQYN^ FQCSMTKILEPFRT 

KNPDIVIYQYMDDLYVGSDIiEIGQHRTKIEEIiREHIiLKWSFTT^^ 

DIQKLVGKLNWASQIYPGIKVKQI/ZKLLRG 

TYQ I YQEPFKNIiKTGKYAKMRS AHTND VKQLTEAVQKI ALES I VI WGKTPKFRLP I LKETWDTWWTDYWQATW I PEWEFVNT 
PPLVKXjWYQLETEPIVGAETFYVDGASNRETIQCGKAGYVTDRGRQKVVSIjTETTNQKAEIjQ 
AIjGIIQAQPDK5ESEIjVNQIIEQIjIQKBKVYIjSWVT>AHKGI 
SDFNLPPWAKEIVASCDKCQLKGEa^MHGQVDCSPGIWQI^ 

GRWPVKI IHTDNGSNFTSAAVKAACWWAGIQQEFGIPYNPQSQGVVESMNKELKKI IGQVRDQAEHX.KTAVQMAVFIHNFKR 

KGGIGGYSAGERIIDIIATDIQTRELQKQITKIQNFRVYYRDSRDPWKGPAl^ 

DYGKQMAGDDCVAGRQDED$ 

2003_CON_F1 pol.OPT 

TTCTTCCGCGAGAACCTGGCCTTCCAGCAGGGCGAGGCCCGCAAGTTCCCCTCCGAGCAGACCCGCGCCAACTCCCCCGCCT 

CCCGCGAGCTGCGCGTGCAGCGCGGCGACAACCCCCTGTCCGAGGCCGGCGCCGAGCGCCGCGGCACCGTGCCCTCCCTGTC 

CTTCCCCCAGATCACCCTGTGGCAGCGCCCCCTGGTGACCATCAAGATCGGCGGCCAGCTGAAGGAGGCCCTGCTGGACACC 

GGCGCCGACGACACCGTGCTGGAGGACATGAACCTGCCCGGCAAGTGGAAGCCCAAGATGATCGGCGGC^TCGGCGGCTTCA 

TCAAGGTGAAGCAGTACGACCACATCCTGATCGAGATCTGCGGCCACAAGGCGATCGGCACCGTGCTGGTGGGCCCCACCCC 

CGTGAACATCATCGGCCGCAACATGCTGACCCAGATCGGCTGCACCCTGAACTTCCCCATCTCCCCCATCGA 

GTGAAGCTGAAGCCCGGCATGGACGGCCCQAAGGTGAAGCAGTGGCCCCTGACCGAGGAGAAGATCAAGGCCCTGACCGAGA 

TCTGCACCGAGATGGAGAAGGAGGGCAAGATC^CC^UVGATCGGCCCCGAGAACCCCTACAACACCCCCGTGTTCGCCATCAA 

GAAGAAGGACTCCACCAAGTGGCGCAAGCTGGTGGACTTCCGCGAGCTGAACAAGCGCACCCAGGACTTCTGGGAGGTGCAG 

CTGGGCATCCCCCACCCCGCC^GCCTGAAGAAGAAGAAGTCCGTGACCGTGCTGGACGTGGGCGACGCCTACTTCTCCGTGC 

CCCTGGACAAGGACTTCCGCAAGTACACCGCCTTCACCATCCCCTCCGTGAACAACGAGACCCCCGGCATCCGCrACCAGTA 

CAACGTGCTGCCCCAGGGCTGGAAGGGCTCCCCCGCCATCTTCCAGTGCTCCATC 

AAGAACCCCGACATCGTGATCTACCAGTACATGGACGACCTGTACGTGGGCTCCGACCTGGAGATCGGCCAGCACCGCACCA 
AGATCGAGGAGCTGCGCGAGCACCTGCTGAAGTGGGGCTTCACGACCCCCGACAAGAAGGACG 

GTGGATGGGCTACGAGCTGCACCCCGACAAGTGGACCGTGCAGCCCATCCAGCTGCCCGACAAGGACTCCTGGACCGTGAAC 

GACATCCAGAAGCTGGTGGGCAAGCTGAACTGGGCCTCCCAGATCTACCCCGGCATCAAGGTGAAGCAGCTGTGCAAGCTGC 

TGCGCGGCGCCAAGGCCCTGACCGACATCGTGCCCCTGACCGCCGAGGCCGAGCTGGAGCTGGCCGAGAACCGCGAGATCCT 

GAAGGAGCCCGTGCACGGCGTGTACTACGACCCCTCCAAGGACCTGATCGCCGAGATCCAGAAGCAGGGCCAGGGCCAGTGG 

ACCTACCAGATCTACCAGGAGCCCTTCAAGAACCTGAAGACCGGCAAGTACGCCAAGATGCGCTCCGCCCACACCAACGACG 

TGAAGCAGCTGACCGAGGCCGTGCAGAAGATCGCCCTGGAGTCCATCGTGATCTGGGGCAAGACCCCCAAGTTCCGCCTGCC 

CATCCTGAAGGAGACCTGGGACACCTGGTGGACCGACTACTGGCAGGCCACCTGGATCCCCGAGTGGGAGTTCGTGAACACC 

CCCCCCCTGGTGAAGCTGTGGTACCAGCTGGAGACCGAGCCCATCGTGGGCGCCGAGACCTTCTACGTGGACGGCGCCTCCA 

ACCGCGAGACCAAGAAGGGCAAGGCCGGCTACGTGACCGACCGCGGCCGCCAGAAGGTGGTGTCCCTGACCGAGACCACCAA 

C(^GAAGGCCGAGCTGCAGGCCATCCACCTGGCCCTGCAGGACTCCGGCTCCGAGGTGAACATCGTGACCGACTCCCAGTAC 

GCCCTGGGCATCATCCAGGCCCAGCCCGACAAGTCCGAGTCCGAGCTGGTGAACCAGATCATCGAGCAGCTGATCCAGAAG^ 

AGAAGGTGTACCTGTCCTGGGTGCCCGCCCACAAGGGCATCGGCGGCAACGAGCAGGTGGACAAGCTGGTGTCCGC

CCGCAAGATCCTGTTCCTGGACGGCATCGACAAGGCCCAGGAGGAGCACGAGAAGTACCACAACAACTGGCGCGCCATGGCC 

TCCGACTTCAACCTGCCCCCCGTGGTGGCCAAGGAGATCGTGGCCTCCTGCGACAAGTGCCAGCTGAAGGGCGAGGCCATGC 

ACGGCCAGGTGGACTGCTCCCCCGGCATCTGGCAGCTGGACTGCAC 

CGTGGCCTCCGGCTACATCGAGGCCGAGGTGATCCCCGCCGAGACCGGCCAGGAGACCGCCTACTTCATCCTGAAGCTGGCC 
GGCCGCTTOGCCCGTGAAGATCATCCACACCXaACAACGGCTCCAACTTCACCTCCGCCGCCGTGAAGGC 

CCGGCATCCAGCAGGAGTTCGGCATCCCCTACAACCCCCAGTCCCAGGGCGTGGTGGAGTCCATGAACAAGGAGCTGAAGAA 
GATCATCGGCCAGGTGCGCGACCAGGCCGAGCACCTGAAGACCGCCGTGCAGATGGCCGTGTTCATCCACAACTTCAAGCGC 
AAGGGCGGCATCGGCGGCTACTCCGCCGGCGAGCGCATCATCGACATCATCGCGACCGAC^

AGCAGATCACCAAGATCCAGAACTTCCGCGTGTACTACCGCGACTCCCGCGACCCCGTGTGGAAGGGCCCCGCCAAGCT^CT 

GTGGAAGGGCGAGGGCGCCGTGGTGATCCAGGACAACTCCGAGATCAAGGTGGTGCCC 

GACTACGGCAAGCAGATGGCCGGCGACGACTGCGTGGCCGGCCGCCAGGACGAGGACTAA 

72. 2003_CON_F2 pol.PEP

FFRENIAFQQGEARKFSSEQTRANSPASRELRVRRGDNSLPEAGAERQGTGS 

GADDTVLEDINLPGKWKPKMIGGIGGFIKVRQYDQIPIEICGQKAIGTVLVGPTPW 

VKLKPGMDGPKVKQWPLTEEKIKAIiTEICTEMEKEGKISKIGPE^PYNTPVFAIKKra 

LGI PHPAGLKKKKS VTVXjDVGDAY FS VPLiDKEFRKYTAFT IPS INNET PG IR YQ YNVIjPQGWKG S PAI FQS S MTKI LE P FRA 

KNPEIVIYQYMDDLYVGSDLEIGQHRTKIEEIiREHLLRWGFCT^ 

DIQKLVGKIJTWASQIYPGIRVKHLCKLLRGAKALTDWPL^ 

TYQIYQEPHKOTiKTGKYARRKSAHTNDWQLTEWQKIATEGIVIWGKVPKFRLPIQK^ 
PPLVKLWYQLETEPIVGAETFYVDGAANRETKIjGKAGYVTDRGRQK^ 
ALGIIQAHPDKSESELVNQIIEQLIQKERVYIjSWVPAHKGIGGNEQVDKLVSTGIRK^ 
SDFmjPPVVAKEIVASCT>KCQLKGEAMHGQVDCSPGIW 

GRWPVKI IHTDNGSNFTST\AnCAACWWAGIQQEFGIPYNPQSQGVVBSMNKEl J KKI IGQVRDQAEHLKTAVQMAVF IHNFKR 
KGG IGG Y SAGERI ID 1 1 ATD I QTKELQKQ I TKI QN FRVYFRDSRDPVWKGPAKLLWKGEGAW I QDNNE I KWPRRKAKI IR 
D YGKQMAGDDCVAGRQDED $ 
2003_CON_F2 pol.OPT

TTCTTCCGCGAGAACCTGGCCTTCCAGCAGGGCGAGGCCCGCAAGTTCTCCTCCGAGCAGACCCGCGCCAACTCCCCCGCCT 
CCCGCGAGCTGCGCGTGCGCCGCGGCGACAACTCCCTGCCCGAGGCCGGCGCCGAGCGCCAGGGCACCGGCTCCTCCCTGGA 
CTTCCCCCAGATCACCCTGTGGCAGCGCCCCCTXSGTGACC^TCAAGGTGGGCGGCCAGCTGCGCGAGG 

GGCGCCGACGACACCGTGCTGGAGGACATCAACCTGCCCGGCAAGTGGAAGCCCAAGATGATCGGCGGCATCGGCGGCTTCA 

TCAAGGTGCGCCAGTACGACCAGATCCCCATCGAGATCTGCGGCCAGAAGGCCATCGGCACCGTGCTGGTGGGCCCCACCCC 

CGTGAACATCATCGGCCGCAACATGCTGACCCAGATCGGCTGC^CCCTGAACTTCCCCATCTCCCCCATCGAGACCGTGCCC 

GTGAAGCTGAAGCCCGGCATGGACGGCCCCAAGGTGAAGCAGTGGCCCCrTGACCGAGGAGAAGATCAAGGCCCTGACCGAGA 

TCTGCACCGAGATGGAGAAGGAGGGCAAGATCrrCCAAGATCGGCCCCGAGAACCCCTACAACACCCCCGTGTTC^ 

GAAGAAGGACTCCACCAAGTGGCGCAAGCTGGTGGACTTCCGCGAGCTGAACAAGCGCACCCAGGACTTCTGGGAGGTGCAG 

CTGGGCATCCCCCACCCCGCCGGCCTGAAGAAGAAGAAGTCCGTGACCGTGCTGGACGTGGGCGACGCCTACTTCTCCGTGC 

CCCTGGACAAGGAGTTCCGCAAGTACACCGCCTTCACCATCCCCTCCATCAACAACX3AGACCCCCGGCATCCGCTACCAGT 

GAACGTGCTGCCCCAGGGCTGGAAGGGCTCCCCCGCCATCTTCCAGTCCTCCATGACCAAGATCC 

AAGAACCCCGAGATCGTGATCTACCAGTACATCGACGACCTGTACGTGGGCTCCGACCTGGAGATCGGCCAGCACCGCACCA 

AGATCGAGGAGCTGCGCGAGCACCTGCTGCX3CTGGGGCTTCACCA 

GTGGATGGGCTACGAGCTGCACCCCGACAAGTGGACCGTGCAGGCCATCCAGCTC 

GACATCCAGAAGCTGGTGGGCAAGCTGAACTGGGCCTCCCAGATCTACCCCGGCATCCGCGTGAAGCACCTGTGCAAGCTGC 
TGCGCGGCGCCAAGGCCCTGACCGACGTGGTGCCCCTGACCGCCGAGGCCGAGCTGGAGCTGGCCGAGAACCGCGAGATCCT 
GAAGGAGCCCGTGCACGGCGTGTACTACGACCCCTCCAAGGACCTGATCGCCGAGATCCAGAAGCAGGGCCACGACCAGTGG 
ACCTACCAGATCTACCAGGAGCCCCACAAGAACCTGAAGACCGGCAAGTACGCCCGCCGCAAGTCCGCCCACACCAACGACG 
TGAAGCAGCTGACCGAGGTGGTGCAGAAGATCGCCACCGAGGGCATCGTGATCTGGGGCAAGGTGCCCAAGTTCCGCCTGCC 
CATCCAGAAGGAGACCTGGGAGATCTGGTGGACCGAGTACTGGCAGGCCACCTGGATCCCCGAGTGGGAGTTCGTGAACACC 
CCCCCCCTGGTGAAGCrTGTGGTACGAGCTGGAGACCGAGCCCATCGTGGGCGCCGAGACCTTCTACGTGGACGGCGCCGCCA 
ACCGCGAGACCAAGCTGGGCAAGGCCGGCTACGTGACCGACCGCGGCCGCCAGAAGGTGGTGCCCCTGACCGAGACCACCAA 
CCAGAAGACCGAGCTGCAGGCCATCCACCTGGCCCTGCAGGACTCCGGCTCCGAGGTGAACATCGTGACCGACTCCCAGTAC 
GCCCTGGGCATCATCCAGGCCCACCCCGACAAGTCCGAGTCCGAGCTGGT 

AGCGCGTGTACCTGTCCTGGGTGCCCGCCCACAAGGGCATCGGCGGCAACGAGCAGGTGGACAAGCTGGTGTCCACCGGCAT 
CCGCAAGGTGCTGTTCCTGGACGGCATCGACJ^GGCCCAGGAGGAGCACGAGAAGTACCACTCCAACrGGCGCGCCATGGCC 
TCCGACTTGAACCTGCCCCCCGTGGTGGCCAAGGAGATCGTGGCCTCCTGCGACAAGTG 
ACGGCCAGGTGGACTGCTCCCCCGGCATCTGGCAGCTGGACT 

CGTGGCCTCCGGCTACATCGAGGCCGAGGTGATCCCCGCCGAGACCGGCCAGGAGACCGCCTACTTCATCCTGAAGCTGGCC 

GGCCGCTGGCCCGTGAAGATCATCCACACCGACAACGGCTCCAACTTCACCTCCACCGTGGTGAAGGCCX3 

CCGGCATCCAGCAGGAGTTCGGCATCCCCTAGAACCCCCAGTCCCAGGGCGTGG 

GATCATCGGCCAGGTGCGCGACCAGGCCGAGCACCTGAAGACCGCCGTGCAGATGGCCGTGTTCATCCACAACTTCAAGC 

AAGGGCGGCATCGGCGGCTACTCCGCCGGCGAGCGCATCATCGACATCATCGCCACCGACATCGAGACCAAGGAGCTGCAGA 

AGCAGATCACCAAGATCCAGAACTTCCGCGTGTACTTCCGCGACTCCCGCGACCCCGTGTGGAAGGGCCCCGCCAAGCTGCT 

GTGGAAGGGCGAGGGCGCCGTGGTGATCCAGGACAACAACGAGATCAAGGTGGTGCCCCGCCGCAAGGCCAAGATCATCCGC 

GACTACGGCAAGCAGATGGCCGGCGACGACTGCGTGGCCGGCCGCCAGGACGAGGACTAA 

73. 2003_CON_G pol.PEP

FFRENIiAFQQGEAREFSSEQARANSPTRRELRVRRGDSPLPEAGAEGKGAISL^^ 
ADDTVLEEINLPGKWKPKMIGGIGGFIKVRQYDQILIEISGKKAIGTNTLVGPTPIN 
KLKPGMDGPKVKQWPLTEEKIKALTEICTC^ 

GI PHPAGLKKKKSVTVLDVGDAYFS VPLiDENFRKYTAFT IPSTNNETPGIRYQYl^ FQS SMTKILEPFRTK 

NPEIVIYQYMDDLYVGSDLEIGQHRAKIEELREHIjIJW 
IQKLVGKLNWASQIYPGIKVKQLCKLLRGAKAJjTDIVPLTAEAEIiELAENM
YQIYQEPYKm,KTGKYAKRGSAHTNDVKQiyrEWQKIAT^ 
PLVKLWYRLETEPIPGAETYYVDGAANRETKIjGKAGYVTDKGKQKIITIi 

LGI IQAQPDRSESELVNQI IEQLIKKEKVYIjSWVPAHKG IGGNEQVDKLVSSGIRKVIjFLiDGIDKAQEEHERYHSNWRAMAS 
DFNLPPIVAKEIVASCDKCQLKGEAMHGQVDCSPGIWQI^CTHLEGKIIIjVAVHVASGYIEAE^

RWPVKVIHTDNGSNFTSAAVKAACWWANITQEFGI PYNPQSQGWESMNKEIjKKI IGQVRDQAEHLKTAVQMAVFIHNFKRK 
GGIGGYSAGERIIDIIASDIQTKEIiQKQITKIQNFRVYYRDSRDPIWKGPAKLIiWKGEGAWIQDNNEIKW 
YGKQMAGDD CVAGRQDED $ 

2003_CON_G pol.OPT 

TTCTTCCGCGAGAACCTGGCCTTCCAGCAGGGCGAGGCCCGCGAGTTCTCCTCCGAGCAGGCCCGCGCCAACTCCCCCACC 

GCCGCGAGCTGCGCGTGCGCCGCGGCGACTCCCCCCTGCCCGAGGCCGGCGCCGAGGGCAAGGGCGCCATCTCCCTGTCCTT 

CCCCCAGATCACCCTGTGGCAGCGCCCCCTGGTGACCGTGAAGATCGGCGGCCAGCTGATCGAGGCCCTGCTGGACACCGGC 

GCCGACGACACCGTGCTGGAGGAGATCAACCTGCCCGGCAAGTGGAAGCCGAAGATGATCGGCGGCATCGGCGGCTTGAT 

AGGTGCGCCAGTACGACCAGATCCTGATCGAGATCTCCGGCAAGAAGGCCATCGGCACCGTGCTGGTGGGCCCCACCCCCAT 

GAACATGATCGGCCGGAACATGCTGACCCAGATCGGCTGCAC^ 

AAGCTGAAGCCCGGCATGGACGGCCCCAAGGTGAAGCAGTGGCCCCTGACCGAGGAGAAGATCAAGGCCCTGACCGAGATCT 
GCACCGAGATGGAGAAGGAGGGCAAGATCTCCAAGATCGGCCCCGAGAACCCCTACAACACCCCCATCTTCGCCATCAAGAA 
GAAGGACTCCACCAAGTGGCGCAAGOTGGTGGACTTCCGCGAGCTGAACAAGCGCACCCAGG 

GGCATCCCCCACCCCGCCGGCCTGAAGAAGAAGAAGTCCGTGACCGTGCTGGACGTGGGCGACGCCTACTTCTCCGTGCCCC 
TGGACGAGAACTTCOSCy^GTACACCGCCTTCACCATCCCCTCC^ 

CGTGCTGCCCCAGGGCTGGAAGGGCTCCCCCGCCATCTTCCAGTCCTCCATGACCAAGATCCTGGAGCCCTTCCGCACCAAG 

AACCCCGAGATCGTGATCTACCAGTACATGGACGACCTGTACG'IXSGGCTCCGACCTGGAGATCGGCCAGCACCGCGCCAAGA 

TCGAGGAGCTGCGCGAGC^CCTGCTGCGCTGGGGCTTCACCACCCCCGACAAGAAGCA 

GATGGGCTACGAGCTGCACCCCGACAAGTGGACCGTGCAGCCCATCCAGCTGCCC^^ 

ATCCAGAAGCTGGTGGGCAAGCTGAACTGGGCCTCCCAGATCTACCCCGGCATC^^ 

GCGGCGCCAAGGCCCTGACCGACATCGTGCCCCTGACCGCCGAGGCCGAGCTGGAGCTGGCCGAGAACCGCGAGATCCTGAA 

GGAGCCCGTGCACGGCGTGTACTACGACCCCTCCAAGGAGCTGATCGCCGAGGTGCAGAAGCAGGGCCTGGACCAGTGGACC 

TACGAGATCTACCAGGAGCCCTACAAGAACCTGAAGACCGGCAAGTACGCCAAGCGCGGCTCCGCCCACACCAACGACGTGA 

AGCAGCTGACCGAGGTGGTGCAGAAGATCGCCACCGAGTCCATCGTGATCTGGGGCAAGACCCCCAAGTT 

CCGCAAGGAGACCTGGGAGGTGTGGTGGACCGAGTACTGGCAGGCCACCTGGATCCCCGAGTGGGAGTTCGTGAACACCCCC 

CCCCTGGTGAAGCTGTGGTACCGCCTGGAGACCGAGCCCATCCCCGGCGCCGAGACCTACTACGTGGACGGCGCCGCCAACC 

GCGAGACCAAGCTGGGCAAGGCCGGCTACGTGACCGACAAGGGCAAGCAGAAGATCATCACCCTGACCGAGACCACCAACCA

GAAGGCCGAGCTGCAGGCCATCCACCTGGCCCTGCAGGACTCCGGCTCCGAGGTGAACATCGTGACCGACTCC 

CTGGGCATCATCCAGGCCCAGCCCGACCGCTCCGAGTCCGAGCTGGTGAACCAGATCATCGAGCAGCTGATCAAGAAGGAGA 

AGGTGTACCTGTCCTGGGTGCCCGCCCACAAGGGCATCGGCGGCAACGAGCAGGTGGACAAGCTGGTGTCCTCCG 

CAAGGTGCTGTTCCTGGACGGCATCGACAAGGCCCAGGAGGAGCACGAGCGCTACCACTCCAACTGGCGCGCCATGGCCTCC 

GACTTCAACCTGCCCCCCATCGTGGCCAAGGAGATCGTGGCCTCCTGCGACAAGTGCCAGCTC3AAGGGCGAGGCCATGCACG 

GCCAGGTGGACTGCT*CCCCCGGGATCTGGCAGCTGGACTGCACCCACCTGGAGGGC^AGATCATCCTGGTGGC 

GGCCTCCGGCTACATCGAGGCCGAGGTGATCCCCGCCGAGACCGGCCAGGAGACCGCCTACTTCATCCTGAAGCTGGCCGGC 

CGCTGGCCCGTGAAGGTGATCCACACCGACAACGGCTCCAACTTCACCTCCGCCGCCGTGAAGGCCGCCTGCTGGTGGGCCA 

ACATCACCCAGGAGTTCGGCATCCCCTACAACCCCCAGTCCCAGGGCGTGGTGGAGTCCATGAACAAGGAGCTGAAGAAGAT 

GATCGGCCAGGTGCGCGACCAGGCCGAGCACCTC^ 

GGCGGCATCGGCGGCTACTCCGCCGGCGAGCGCATCATCGACATCATCGCCTCCGACATCCAGACCAAGGAGCTGCAGAAGC 
AGATCACCAAGATCCAGAACl^CCGCGTGTACTACCGCGACTCCCGCGACCCCATCTGGAAGGGCCCCGCCAAGCTGCTGTG 
GAAGGGCGAGGGCGCCGTGGTGATCGAGGACAACAACGAGATCAAGGTGGTGCCCCGCCGCAAGGCCAAGATCATCCGCGAC 
TACGGCAAGCAGATGGCCGGCGACGACTGCGTGGCCGGCCGCCAGGACGAGGACTAA 


74. 2003_CON_H pol.PEP

FFRENIiAFQQREARKFSPEQARANSPTSRELRVRRGDDPLSEAGAEGQGTSLSFPQITLW 

DDTVLEEINLPGKWKPKMIGGIGGFIKVRQYEQVAIEICGKKAIGTVLVGPTPVNIIGRNILTQIGCTLNFPISPIETVPVK
LKPGMDGPKVKQWPLTEEKIKALTEICIEMEKEGKISKIGPENPYNTPIFAIKKKDSTKWRKLVDFRELNKRTQDFWEVQLG
IPHPAGLKKKKSVSVLDVGDAYFSVPLDKDFRKYTAFTIPSINNETPGIRYQYNVLPQGWKGSPAIFQSSMTKILEPFRKQN
PEMIIYQYMDDLYVGSDLEIGQHRAKIEE^^
q^VGKLJTOASQIYPGIKVXQLCKLLRGAKAIjTDIV 

qiyqepfk^ktgkyakmrtahtndvkqlteavqkiatesiviwgkipkfrlpiqke™ 

LVKiWYQLETEPIAGAETYYVDGAANRETKIGKAGYV^ 
GIIQAQPDKSESELVNQIIEELIKKEKVYLSWVPAHKGIOT^ 

FNLPPIVAKEIVASCDKCQLKGEAMHGQVDCSPGIWQLDCTHLEGKVILVAVHVASGYIEAEVIPAETGQETAYFILKLAGR
WPVKMIHTDNGSNFTSAAVKAACWWADIQQEFGIPYNPQSQGVVESMNKELKKIIGQVRDQAEHLRTAVQMAVFIHNFKRKG
GIGGYSAGERIIDIIATDIQTKELQKQISKIQKFRVYYRDSRDPIW
GKQMAGDDCVAGRQDED$ 

2003_CON_H pol.OPT

TTCTTCCGCGAGAACCTGGCCTTCCAGCAGCGCGAGGCCCGCAAGTT^ 

CCCGCGAGCTGCGCGTGCGCCGCGGCGACGACCCCCTGTCCGAGGCCGGCGCCGAGGGCCAGGGCACCTCCCTGTCCTTCCC 

CCAGATCACCCTGTGGCAGCGCCCCCTGGTGACCGTGAAGATCGAGGGCCAGCTGCGCGAGGCCCTGCTGGACACCGGCGCC 

GACGACACCGTGCTGGAGGAGATCAACCTGCCCGGCAAGTGGAAGCCCAAGATGATCGGCGGCATCGGCGGCTTCATCAAGG 

TGCGCCAGTACGAGCAGGTGGCCATCGAGATCTGCGGGAAGAAGGCCATCGGCACCGTGCTGGTGGGCCCCACCCCCGTGAA 

CATCATCGGCCGCAACATCCTGACCCAGATCGGCTGCACCCTGAACTTCCCCATCTCCCCCATCGAGACCGTGCCCGTGAAG 

CT«<3AAGCCCGGCATGGACGGCCCGAAGGTGAAGCAGTGGCCCCTGACCGAGGAGAAGATCAA 

TCGAGATGGAGAAGGAGGGCAAGATCTCCAAGATCGGCCCCGAGAACCCCTACAACACCCCGAT^ 

GGACTCCACCAAGTGGCGCAAGCTGGTGGACTTCCGCGAGCTGAACAAGCGCACCCAGGACTTCTGGGAGGTGCA 

ATCCCCCACCCCGCCGGCCTGAAGAAGAAGAAGTCCGTGTCCGTGCTGGACGTGGGCGACGCCTACTTCTCCGTGCCCCTGG 

AGAAGGACTTCCGCAAGTACACCGCCTTCACCATCCCCTCCATCAACAACGAGACCCCCG 

GCTGCCCCAGGGCTGGAAGGGCTCCCCCGCCATCTTCCAGTCCTCCATGACCAAGATCCTGGAGCCCTTCCGCAAGCAGAAC
CCCGAGATGATCATCTACCAGTACATGGACGACCTGTACGTGGGCTCCGACCTGGAGATCGGCCAGCACCGCGCCAAGATCG 
AGGAGCTGCGCGCCCACCrroCTGCGCTGGGGCTTCACCACCCCCGACAAGAAGCACCAGAAGGAGCC 

GGGCTACGAGCTGCACCCCGACAAGTGGACCGTGCAGCCCGTGAAGCTGCCCGAGAAGGACTCCTGGACCGTGAACGACATC 

CAGAAGCTGGTGGGCAAGCTGAACTGGGCCTCCCAGATCTACCCCGGCATCAAGGTGAAGCAGCTGTGCAAGCTGCTGCGCG

GCGCCAAGGCCCTGACCGACATCGTGCCGCTGACCAAGGAGGCCGAGCTGGAGCTGGCCGAGAACCGCGAGATCCTGCGCGA 

GCCCGTGCACGGCGTGTACTACGACCCCTCCAAGGACCTGATCGCCGAGATCCAGAAGCAGGGCCCCGACCAGTGGACCTAC 

CAGATCTACCAGGAGCCCTTCAAGAACCTGAAGACCGGCAAGTACGCCAAGATGCGCACCGCCCACACCAACGACGTGAAGC 

AGCTGACCGAGGCCGTGCAGAAGATCGCCACCGAGTCCATCGTGATCTGGGGCAAGATCCCCAAGTTCCGCCTGCCCATCCA

GAAGGAGACCTGGGAGACCTGGTGGACCGAGCACTGGCAGGCCACCTGGATCCCCGAGTGGGAGTTCGTGAACA 

CTGGTGAAGCTGTGGTACCAGCTGGAGACCGAGCCCATCGCCGGCGCCGAGACCTACTACGTGGACGGCGCCGCCAACCGCG

AGACCAAGATCGGCAAGGCCGGCTACGTGACCGACCGCGGCAAGCAGAAGGTGGTGTCCCTGACCGAGACCACCAACCAGAA 

GACCGAGCTGCAGGCCATCTACCTGGCCCTGCAGGACTCCGGCCTGGAGGTGAACATCGTGACCGACTCCCAGTACGCCCTG 

GGCATCATCCAGGCCGAGCCCGACAAGTCCGAGTCCGAGCTGGTGAACCAGATCATCGAGGAGCTGATCAAGAAGGAGAAGG

TGTACCTGTCCTGGGTGCCCGCCCACAAGGGCATCGGCGGCAACGAGCAGGTGGACAAGCTGGTGTCCTCCGGCATCCGCAA 

GGTGCTGTTCCTGGACGGCATCGACAAGGCCCAGGAGGAGCACGAGCGCTACCACAACAACTGGCGCGCCATGGCCTCCGA^ 

TTCAACCTGCCCCCCATCGTGGCCAAGGAGATCGTGGCCTCCTGCGACAAGTGCCAGCTGAAGGGCGAGGCCATGCAC 

AGGTGGACTGCTCCCCCGGCATCTGGCAGCTGGACTGCACCCACCTOT 

CTCCGGCTACATCGAGGCCGAGGTGATCCCCGCCGAGACCGGCCAGGAGACCGCCTACTTCATCCTGAAGCTGGCCGGCCGC 

TGGCCCGTGAAGATGATCCACACCGACAACGGCTCCAACTTCACCTCCGCCGCCGTGAAGGCCGCCTGCTGGTG 

TCGAGCAGGAGTTCGGCATCCCCTACAACCCCCAGTCCCAGGGCGTGGTGGAGTCCATGAACAAGGAGCTGAAGAAGATCAT 

CGGCCAGGTGCGCGACCAGGCCGAGCACCTGCGCACCGCCGTGCAG 

GGCATCGGCGGCTACTCCGCCGGCGAGCGCATCATCGACATGATCGCCACCGACATCCAGACCAAGGAGCTGCAGAAGCAGA 
TCTCCAAGATCCAGAAGTTCCGCGTGTACTrACCGCGACTCCCGCGACCCGATCTGGAAG 

GGGCGAGGGCGCCGTGGTGATCCAGGACAACTCCGAGATCAAGGTGGTGCCCCGCCGCAAGGCCAAGATCATCCGCGACTAC 
GGCAAGCAGATGGCCGGCGACGACTGCGTGGCCGGCCGCCAGGACGAGGACTAA 

75. 2003_CON_01_AE pol.PEP

(f ' FF RENIJ^QQGKAGEFSSEQTRANSPTSRKI^ 
Dt QADDTVLEDIl^PGKWKPKMIGGIGGFIKVRQYDQILIEICGKKAIGT^ 
7/ VTLKPGMDGPKVKQWPLTEEKIKALTEICKEMEEEGKISKIGPENPYNTPVFAIKK^ 

IXSIPHPAGLKKKKSVTVIjDVGDAYFSVPIiDESFRKYTAPTIPSINNETPGIRYQYNVXiPQGWKGSPA 
KNPEMVIYQYMDDI*YVGSDLEIGQHRTKIEELRAHLLSWGFTT 

D I QKLVGKLiNWAS Q I YAG I KVKQLjCKLiIiRGAKALiTD I VPLTEEAELEIjAENRE I IjKTPVHGVYYDPSKDLVAE VQKQGQDQW 
TYQIYQEPFKNLKTGK^ARKRSAHTNDVRQLTEWQKIATESIVIWGKTPKFRL 
PPI#VKLWYQIjEKDPIVGAETFYVDGAASRETKLGKAGYVTDRGRQKVVSIjTETTNQK^ 
AIjGI IQAQPDRSESEVVNQI IEELI KJCEKVYLSWVT^AHKGIGGN^ 
SDFNLPPIVAKEIVANCDKCQLKGKAMHGQVDCSPGIWQLDCTHLEGKVILVAVHVASGYIEAEVIPAETGQETAYFLLKLA
GRWPVKVIHTDNGSNFTSAAVKAACWWANVRQEFGIPYNPQSQGVVESMNKELKKIIGQVREQAEHLKTAVQMAVFIHNFKR
KGGIGGYSAGERIIDIIATDIQTKELQKQITKIQNFRVYYRDSRDPIWKGPAKLLWKGEGAVVIQDNSDIKVVPRRKAKIIR
DYGKQMAGDDCVAGRQDED$

2003_CON_01_AE pol.OPT

TTCTTCCGCGAGAACCTGGCCTTCCAGCAGGGCAAGGCCGGCGAGTTCTCCTCCGAGCAGACCCGCGCCAACTCCCCCACCT 
CCCGCAAGCTGGGCGACGGCXMCCGCGACAACCTGCTGACCGAGGCCGGCGCCGAGCGCCAGGGCACCTCCTCCTCCTTCTC 
CTTCCCCCAGATCACCCTGTGGCAGCGCCCCCTGGTGACCGTGAAGATCGGCGGCCAGCTGAAGGAGGCCCTGCTGGACACC 
GGCGCCGACGACACCGTGCTGGAGGACATCAACCTGCCCGGCAAGTGGAAGCCCAAGATGATCGGCGGCATCGGCGGCTTCA 
TCAAGGTGCGCCAGTACGACCAGATCCTGATCGAGATCTGCGGCAAGAAGGCCATCGGCACCGTGCTGGTGGGCCCCACCCC 
CGTGAACATCATCGGCCGCAACATGCTGACCCAGATCGGCTO 

GTGACCCTGAAGCCCGGCATGGACGGCCCCAAGGTGAAGCAGTGGCCCCTGACCGAGGAGAAGATCAAGGCCCTGACCGAGA 
TCTGCAAGGAGATGGAGGAGGAGGGCAAGATCTCCAAGATCGGCCCCGAGAACCCCTACAACACCCCCGTGTTCGCCATCAA
GAAGAAGGACTCCACCAAGTGGCGCAAGCTGGTGGACTTCCGCGAGCTGAACAAGCGCACCCAGGACTTCTGGGAGGTGCAG
CTGGGCATCCCCCACCCCGCCGGCCTGAAGAAGAAGAAGTCCGTGACCGTGCTGGACGTGGGCGACGCCTACTTCTCCGTGC 
CCCTGGACGAGTCCTTCCGCAAGTACACCGCCTTCACCATCCCCTCCATCAACAACGAGACCCCCGGCATCCGCTACCAGTA 
CAACGTGCTGCCCCAGGGCTGGAAGGGCTCCCCCGCCATCTTCGAGTCCTCCATG^ 

AAGAACCCCGAGATGGTGATCTACCAGTACATGGACGACCTGTACGTGGGCTCCGACCTGGAGATCGGCCAGCACCGCACCA

AGATCGAGGAGCTGCGCGCCCACCTGCTGTCCTGGGGCTTCACCACCCCCGACAAGAAGCACCAGAAGGAGCCCCCCTTCCT 

GTGGATGGGCTACGAGCTGCACCCCGACCGCTGGACCGTGCAGCCCATCGAGCTGCCCGAGAAGGACTCCTGGACCGTGAAC 

GACATCCAGAAGCTGGTGGGC^GCTGAACTGGGCCTCCCAGATCTACGCCGGCATCAAGGTGAAGCAGCTC 

TGCGCGGCGCCAAGGCCCTGACCGACATCGTGCCCCTGACCGAGGAGGCCGAGCTGGAGCTGGCCGAGAACCGCGAGATCCT 

GAAGACCCCCGTGCACGGCGTGTACTACGACCCCTCCAAGGACCTGGTGGCCGAGGTGCAGAAGCAGGGCCAGGACCAGTGG

ACCTACCAGATCTACCAGGAGCCCTTCAAGAACCTGAAGACCGGCAAGTACGCCCGCAAGCGCTCCGCCCACACCAACGACG 

TGCGCCAGCTGACCGAGGTGGTGCAGAAGATCGCCACCGAGTCCATCGTGATCTGGGGCAAGACCCCCAAGTTCCGCCTGCC 

CATCCAGCGCGAGACCTGGGAGACCTGGTGGATGGAGTACTGGCAGGCGACCTGGATCCCCGAGTGGGAGTTCGTGAACACC 

CCCCCCCTGGTGAAGCTGTGGTACGAGCTGGAGAAGGACCCCATCGTGGGCGCCGAGACCTTCTACGTGGACGGCGCCGCCT

CCCGCGAGACCAAGCTGGGCAAGGCCGGCTACGTGACCGACCGCGGCCGCCAGAAGGTGGTGTCCCTGACCGAGACCACCAA 

CCAGAAGACCGAGCTGGACGCC^TCCACCTGGCCCTGCAGGACTCCGGCrrCCGAGGTGAACATCGTGACCGACTCCCAGT^ 

GCCCTGGGCATCATCCAGGCCCAGCCCGACCGCTCCGAGTCCGAGGTGGTGAACCAGATCATCGAGGAGCTGATCAAGAAGG

AGAAGGTGTACCTGTCCTGGGTGCCCGCCCACAAGGGCATCGGCGGCAACGAGCAGGTGGACAAGCTGGTGTCCTCCGGCAT 

CCGCAAGGTGCTGTTCCTGGACGGCATCGACAAGGCCCAGGAGGAGCACGAGCX3C 

TCCGACTTCAACCTGCCCCCCATCGTGGCCAAGGAGATCGTGGCCAACTGCGACAAGTGCCAGCTGAAGGGC 
ACGGCCAGGTGGACTGCTCCCCCGGCATCTGGCAGCTGGACTGCACCCACCTGGAGGGCAAGGTGA 

CGTGGCCTCCGGCTACATCGAGGCCGAGGTGATCCCCGCCGAGACCGGCCAGGAGACCGCCTACTTCCTGCTGAAGCTGGCC 
GGCCGCTGGCCCGTGAAGGTGATCCACACCGACAACGGCTCCAACTTCACCTCCGCCGCCGTGAAGGCCGCCTGCTGGTGGG 
CCAACGTGCGCCAGGAGTTCGGCATCCCCTACAACCCCCAGTCCCAGGGCGTGGTGGAGTCCATGAACAAGGAGCTGAAGAA 
GATCATCGGCCAGGTGCGCGAGGAGGCCGAGCACCTGAAGACCGCCGTGCAGATGGCCGTGTTCATCCACAACTTCAAGCGC
AAGGGCGGCATCGGCGGCTACTCCGCCGGCGAGCGCATCATCGACATCATCGCCACCGACATCCAGACCAAGGAGCTGCAGA 
AGCAGATCACCAAGATCCAGAACTTCCGCGTGTACTACCGCGACTCCCGCGACCCCATCTGGAAGGGCCCCGCCAAGCTGCT 
GTGGAAGGGCGAGGGCGCCGTGGTGATCCAGGACAACTCCGACATCAAGGTGGTGCCCCGCCGCAAGGCCAAGATCATCCGC 
GACTACGGCAAGCAGATGGCCGGCGACGACTGCGTGGCCGGCCGCCAGGACGAGGACTAA 






76. 2003_CON_02_AG pol.PEP
FFRENIiAFQQGEARKFSSEQTGTNSPTSREIiWDGGRDNL,LSEAGTEGQGTI S SFNFPQITLWQRPLVTVRIGGQIilEALIJDT 
GADDTVLEEINIiPGKWKPKMIGGIGGFIKVT^QYDQILIEICGKKAIGTVLVGPTPV^ 
VIOjKPGMDGPKVKQWPLTEEKIKALTDICTEMEKEGKISKIGPENPYNTPVT 

XjGI PHPAGLKKKKSVTVLDVGDAYFSVPIjDKDFRKYTAFTI PSVNNETPG IRYQYNVLPQGWKGSPAI FQASMTKIXiEPFRT 
KNPEIVIYQYMDDLWGSDLEIGQHRAKIEEIiREHLIiRWGFTTPDKKHQKEPPFLWMGYEIJi 

D I QKLVGKLNWASQ I YAGI KVKQLiCKLiIjRGAKALTD I VTIjTEEAEI»EL*AENRE I LKEP VHGVYYDPTKDLi X AE I QKQGQDQW 
TYQIYQEPFKNLKTGKYAK24RSAHTNDVKQLTEVVQKVA 

P PLVKIiWYQIiE KD P I VGAETF YVDGAANRETKIjG KAG YVTDRGRQKWS LTETTNQKTEIiHAI HIiAIjQD S G S EVN I VTD S Q Y 
ALGI IQAQPDRS ESELVNQI I EKL IEKDKVYLSWVPAHKGIGGNEQVDKLVSNG IRKVLFLDGIDKAQEEHERYHSNWRAMA 
S DF*NLiPP I VAKE I VAS CDKCQIiKGEAMHGQVDCS PG I WQIiDCTHLiEGKI I LVAVHVAS GY I EAEV I PAETGQETAY F I LKLA 
GRWPVKVIHTDNGSNFTSAAVKAAC^ANVTQEFGIPYNPQSQGVVES^ 

KGGIGGYSAGERIIDIIASDIQTKELQKQITKIQNFRVYYRDSRDPIWKGPAKLLWKGEGAVVIQDNSDIKVVPRRKAKIIR
DYGKQMAGDDCVAGRQDED$ 
2003_CON_02_AG pol.OPT

TTCTTCCGCGAGAACCTGGCCTTCCAGCAGGGCGAGGCCCGCAAGTTCTCCTCCGAGCAGACCGGCACCAACTCCCCCACCT 
CCCGCGAGCTGTGGGACGGCGGCCGCGACAACCTGCTGTCCGAGGCCGGC^^ 

CTTCCCCCAGATCACCCTGTGGCAGCGCCCCCTGGTGACCGTGCGCATCGGCGGCCAGCTGATCGAGGCCCTGCTGGACACC 
GGCGCCGACGACACCGTGCTGGAGGAGATCAACCTGCCCGGCAAGTGGAAGCCCAAGATGATCGGCGGCATCGGCGGCTTCA
TCAAGGTGCGCCAGTACGACCAGATCCTGATCGAGATCTGCGGCAAGAAGGCCATCGGCACCGTGCTGGTGGGCCCCACCCC 
CGTGAACATCATCGGCCGCAACATGCTGACCCAGATCXK3CTGGACCCTGAACTTCC 

GTGAAGCTGAAGCCCGGCATGGACGGCCCCAAGGTGAAGCAGTGGCCCCTGACCGAGGAGAAGATCAAGGCCCTGACC 
TCTGCACCGAGATGGAGAAGGAGGGCAAGATCTCCAAGATCGGCCCCGAGAACCCCTACAACACCCCCGTGTTCGCC 
GAAGAAGGACTCCACCAAGTGGCGCAAGCTGGTGGACTTCCGCGAGCTGAACAAGCGCACCCAGGACTTCTGGGAGGTGCAG
CTGGGCATCCCCC^CCCCGCCGGCCTGAAGAAGAAGAAGTCCGTC^ 

CCCTGGACAAGGACTTCCGCAAGTACACCGCCTTCACCATCCCCTCCGTGAACAACGAGACCCCCGGCATCCGCTACCAGTA 

GAACGTGCTGCCCCAGGGCTGGAAGGGCTCCCCCGCGATCTTCCAGGCCTCCATGACCAAGATCCTGGAGCC 

AAGAACCCCGAGATCGTGATCTACCAGTACATGGACGACCTGTACGTGGGCTCCGACCTGGAGATCGGCCAGCACCGCGCCA 

AGATCGAGGAGCTGCGCGAGCACCTGCTGCGCTGGGGCTTCACCAC 

GTGGATGGGCTACGAGCTGCACCCCGACAAGTGGACCGTGCAGCCCATCCAGCTGCCCGAGAAGGACTCCTGGACCGTGAAC

GACATCCAGAAGCTGGTGGGCAAGCTGAACTGGGCCTCCCAGATCTACGCCGGCATCAAGGTGAAGCAGCTGTGCAAGCTGC 

TGCGCGGCGCCAAGGCCCTGACCGACATCGTGACCCTGACCGAGGAGGCCGAGCTGGAGCTGGCCGAGAACCGCGAGATCCT 

GAAGGAGCCCGTGCACGGCGTGTACTACGACCCCACCAAGGACCTGATCGCCGAGATCCAGAAGCAGGGCCAGGACCAGTGG 

ACCTACCAGATCTACCAGGAGCCCTTCAAGAACCTGAAGACCGGCAAGTACGCCAAGATGCGCTCCGCCCACACCAACGACG

TGAAGCAGCTGACCGAGGTGGTGCAGAAGGTGGCCACCGAGTCCATCGTGATCTGGGGCAAGACCCCCAAGTTCCGCCTGCC 

CATCCAGCGCGAGACCTGGGAGGCCTGGTGGATGGAGTACTGGCAGGCCACCTGGATCCCCGAGTGGGAGTTCGTGAACACC 

CCCCCCCTGGTGAAGCTGTGGTACCAGCTGGAGAAGGACCCCATCGTGGGCGCCGAGACCTTCTACGTGGACGGCGCCGCCA 

ACOTCGAGACCAAGCTGGGCAAGGCCGGCTACGTGACCGACCGCG^ 

CCAGAAGACCGAGCTGCACGCCATCCACCTGGCCCTGCAGGACT 

GCCCTGGGCATCATCCAGGCCCAGCCCGACCGCTCCGAGTCCGAGCTGGTGAACCAGATCATCGAGAAGCTGATCGAGAAGG 

ACAAGGTGTACCTGTCCTGGGTGCCCGCCCACAAGGGCATCGGCGGCAACGAGCAGGTGGACAAGCTGGTGTCCAACGGCAT 

CCGCAAGGTGCTGTTCCTGGACGGCZATCGACAAGGCCCAGGAGGAGCACGAGCGCTACCACTCCAACTGGCGCGCGA^ 

TCCGACTTCAACCTGCCCCCCATCGTGGCCAAGGAGATCGTGGCCTCCTGCGACAAGTGCCAGCTGAAGGGCGAGGCCATGC 

ACGGCCAGGTGGACTGCTCCCCCGGCATCTGGCAGCTGGACTGCACCCACCTGGAGGGC^ 

CGTGGCCTCCGGCTACATCGAGGCCGAGGTGATCCCCGCCGAGACCGGCCAGGAGACCGCCTACTTCATCCTGAAGCTGGCC 
GGCCGCTGGCCCGTGAAGGTGATCGACACCGACAACGGCTCGAACTTC^CCTCC^ 

CCAACGTGACCCAGGAGTTCGGCATCCCCTACAACCCCCAGTCCCAGGGCGTGGTGGAGTCCATGAACAAGGAGCTGAAGAA 
GATCATCGGCCAGGTGCGCGACCAGGCCGAGCACCTGAAGACCGCCGTGCAGATGGCCGTGTTC^ 

AAGGGCGGCATCGGCGGCTACTCCGCCGGCGAGCGCATCATCGACATCATCGCCTCCGACATCCAGACCAAGGAGCTG 
AGCAGATCACCAAGATCCAGAACTTCCGCGTGTACTACCGCGACTCCCGCGACCCCATCTGGAAGGGCCCCGCGAAGCTGCT 
GTGGAAGGGCGAGGGCGCCGTGGTGATCCAGGACAACTCCGACATCAAGGTGGTGCCC 
GACTACGGCAAGCAGATGGCCGGCGACGACTGCGTGGCCGGCCGCCAGGACGAGGACTAA 

77. 2003_CON_03_AB pol.PEP

FFRENLAFQQREARKFS S EQTRAI S PTS RKL WDGGRDNPLPETGTERQGTAS S FNFPQ I TIjWQRPIj VTVR I GGQLKEALiLiDT 

GADDTVLEDIOT,PGKWKPKMIGGIGGFIKTOQYDQILIEI^ 

VTLKPGMDGPKVKQWPLTEEKIKALTDICK^ 

IXSIPHPAGLKKKKSVTVLDVGDAYFSVPLDQDFRKYTAFTIPSTNNETPGIRYQYN^ 
QNPEIVIYQYhTODLYVGSDLEIGQHRTKIEEIJlEHLIiRWGFTTPDKKHQKEPPFLWMGYEIiH 

DIQKLVGKIjNWASQI YAG I KVRQLCKIjIjRGAKAIjTEVI PLTAEAELEIJVENREIIjKEPVHGVYYDPS KDLiVAEIQKQGQGQW 
TYQIYQEPFKNIjKTGKYARIjRGAHTNDVKQIjTEAVQKIATESIVIW 
PPLVKLVryQLEKEPIVGAETFYVDGAANRETKSGKAGYVTDRGRQKW 
ALGIIQAQPDKSESELVSQIIEQLIKKEKVYIAWVPAH^^ 

SDFNIiPPVVAKE I VASC^KCQLKGEAMHGQVDCSPGIWQIjDCTHIjEGKI IIjVAVHVASGY I EAEVI PAETGQETAYFVLKLA 
GRWPVKI IHTDNGSNFISTAVKAACWWAGI KQEFGIPYNPQSQGWESMNKQLiKQI IGQVRDQAEHLKTAVQMAVFIHNFKR 
KGGIGGYSAGERI IDT IATDIQTKEI/QKQI IKIQNFRVYYRDSRDPIWKGPAKIjLiWKGEGAVVIQDNNDIKVVPRRKAKI IR 
DYGKQMAGDDCVASRQDED $ 

2003_CON_03_AB pol.OPT

TTCTTCCGCGAGAACCTGGCCTTCCAGCAGCGCGAGGCCCGCAAGTTCTCCTCCGAGCAGACCCGCGCCATCTCCCCCACCT 
CCCGCAAGCTGTGGGACGGCGGCCGCGACAACCCCCTGCCCGAGACCGGCACCGAGCGCCAGGGCACCGCCTCCTCCTTCAA 
CTTCCCCCAGATCACCCTGTGGCAGCGCCCCCTGGTGACCGTGCGCATCGGCGGCCAGCTGAAGGAGGCCCTGCTGGACACC 
GGCGCCGACGACACCGTGCTGGAGGACATCAACCTGCCCGGCAAGTGGAAGCCCAAGATGATCGGCGGCATCGGCGGCTTCA 
TCAAGGTGCGCCAGTACGACCAGATCCTGATCGAGATCTGCGGCAAGAAGGCCATCGGCACCGTGCTGGTGGGCCCCACCCC

CGTGAACATCATCGGCCGCAACATGCTGACCCAGCTGGGCTGCACCCTGAACTTCCCCATCTCCCCCATCGAGACC 

GTGACCCTGAAGCCCGGCATGGACGGCCCCAAGGTGAAGCAGTGGCCCCTGACCGAGGAGAAGATCAAGGCCCTGACCGACA 

TCTGCAAGGAGATGGAGAAGGAGGGCAAGATCTCCAAGATCGGCCCCGAGAACCCCTACAACACCCCCGTGTTCGCCA 

GAAGAAGGACTCC^CCAAGTGGCGCAAGCTGGTGGACTTCCGCGAGCTGAACA^ 

CTGGGCATCCCCCACCCCGCCGGCCTGAAGAAGAAGAAGTCCGTGACCGTGCTGGACGTC 

CCCTGGACCAGGACTTCCGC^GTACACCGCCTT<^^ 

GAACGTGCTGCCCCAGGGCTGGAAGGGCTCCCCCGCCATCTTCCAGTCCTCCATG^ 

CAGAACCCCGAGATCGTGATCTACCAGTACATGGACGACCTGTACGTGGGCTCCGACCTGGAGATCGGCCAGCACCGCACCA 
AGATCGAGGAGCTGCGCGAGCACCTGCTGCGCTGGGGCTTCACCACCCCCGAC^ 

GTGGATGGGCTACGAGCTGCACCCCGACAAGTGGACCGTGCAGCCCATCGTGCTGCCCGAGAAGGACTCCTGGACCGTGAAC 

GACATC GAGAAGCTGGTGGGCAAGCTroAACTGGGCCTCCCAGATCTACGC CTGTGCAAGCTGC 

TGCGCGGCGCCAAGGCCCTGACCGAGGTGATCCCCCTGACCGCCGAGGCCGAGCTGGAGCTGGCCGAGAACCGCGAGATCCT

GAAGGAGCCCGTGCACGGCGTGTACTACGACCCCTCGAAGGACCTGGTGGCCGAGATCCAGAAGCAGGGCCAGGGCCAGTGG 

ACCTACCAGATCTACCAGGAGCCCTTCAAGAACCTGAAGACCGGCAAGTACGCCCGCCTGCGCX^ 

TGAAGCAGCTGACCGAGGCCGTGCAGAAGATCGCCACCGAGTCCAT^ 

CATCCAGAAGGAGACCTGGGAGACCTGGTGGACCGAGTACTGGCAGGCCACCTGGATCCCCGAGTGGGAGTTCGTGAACACC 
CCCCCCCTGGTGAAGCTGTGGTACCAGCTGGAGAAGGAGCCCATCGTGGGCGCCGAGACCTTCTACGTGGACGGCGCCGCCA 
ACCGCGAGACCAAGTCCGGCAAGGCCGGCTACGTGACCGACCGCGGCCGCCAGAAGGTGGTGTCCCTGACCGACACCACCAA
CCAGAAGACCGAGCTGCAGGCCATCCACCTGGCCCTGCAGGACTCCGGCCTGGAGGTGAACATCGTGACCGACTCCCAGTAC 
GCCCTGGGCATCATCCAGGCCCAGCCCGAGAAGTCCGAGTCCGAGCTGGTGTCCCAGATCATCGAGCAGCTGATCAAGAAGG 
AGAAGGTGTACCTGGCCTGGGTGCCCGCCCACAAGGGCATCGGCGGCAACGAGCAGGTGGACAAGCTGGTGTCCGCCGGCAT 
CCGCAAGGTGCTGTTCCTGGACGGGATCGACAAGGCCCAGGAGGCCCACGAGAAGTAC 

TCCGACTTCAACCTGCCCCCCGTGGTGGCGAAGGAGATCGTGGCCTCCTGCGACAAGTGCCAGCTGAAGGGCGAGGCCAT^ 
ACGGCCAGGTGGACTGCTCCCCCGGCATCTGGCAGCTGGACTGCAC^ 

CGTGGCCTCCGGCTACATCGAGGCCGAGGTGATCCCCGCCGAGACCGGCCAGGAGACCGCCTACTTCGTGCTGAAGCTGGCC
GGCCGCTGGCCCGTGAAGATCATCCACACCGACAACGGCTCCAACTTC 

CCGGCATCAAGCAGGAGTTCGGCATCCCCTACAACCCCCAGTCCCAGGGCGTGGTGGAGTCCATGAACAAGCAGCTGAAGC^ 
GATCATCGGCCAGGTGCGCGACCAGGCCGAGCACCTGAAGACCGCC^ 

AAGGGCGGCATCGGCGGCTACTCCGCCGGCGAGCGCATCATCGACATCATCGCCACCGACATCCAGACCAAGGAGCTGCAGA
AGCAGATCATCAAGATCCAGAACTTCCGCGTGTACTACCGCGACTCCCGCGACCCCATCTGGAAGGGCCCCGCCAAGCTGCT
GTGGAAGGGCGAGGGCGCCGTGGTGATCCAGGACAACAACGACATCAAGGTGGTGCCCCGCCGCAAGGCCAAGATCATCCGC 
GACTACGGCAAGCAGATGGCCGGCGACGACTGCGTGGCCTCCCGCCAGGACGAGGACTAA 

78. 2003_CON_04_CPX pol.PEP

FFRENVAFQQREARKFSSEQARANSPARRELRDERGDNIiLSEAGTEGQGTISFNFPQITLWQRPLVTIKIGGQ 

ADDTVLEEINLPGKWKPKMIGGIGGFIKVRQYDQIPIEIC^ 

KLKPGMDGPKTOQWPLTEEKIKAIjTEICTE^KEGKISKI^ 

giphpaglkkkksvtvldvgdayfsvpijdpefrkytaft^ 

NPEIVIYQYMDDLYVGSDLEIGQHRAKIEEIjraHIjL^ 

IQKLVGKLNWASQIYPGIKVKQLCKLLiRGAKAIjTDIVPIjTTEAEIjELAENR 

YQIYQEPYKNLKTGKYAKTRSAHTNDVRQIiTEAVQKIAMECIVIWGKTPKFRLPIQ 

PLVKLWYQLETDPIAGAETFYVIXSAASRETKQGKAGYV^ 

IGIIQAQPDRSESDLVNQIIEQIilQKDKVYLSVA^AHKGIGGNEQVDKIiVSNGIRKVljFIjDGIDKA 

DFNLPPWAKE IVAS CNKCQLiKGEAMHGQVDCSPG IWQLDCTHIjEGKI ILVAVHVASGY IEAEVI PAETGQETAYFI LKLAG 
RWPVKIIHTDNGPNFTSAAVKAACWWADIQQEFGIPYNPQSQGVVESMNKELKKI 

GGIGGYSAGERI IDI IASDIQTKELQKQITKIQNFRVYYI^SRDPIWKGPAKLIjWKGEGAVVIQDNSDIKVVPRRKAKI IRD 
YGKQMAGDDCVAGRQDED$ 

2003_CON_04_CPX pol.OPT 

TTCTTCCGCGAGAACGTGGCCTTCCAGCAGCGCGAGGCCCGCAAGTTCTCCTCCGAGCAGGCCCGCGCCAACTCCCCCGCCC 
GCCGCGAGCTGCGCGACGAGCGCGGCGACAACCTGCTGTCCGAGGCCGGCACCGAGGGCCAGGGCACCATCTCCTTCAACTT 
CCCCCAGATCACCCTGTGGCAGCGCCCCCTGGTGACCATCAAGATCGGCGGCCAGATCCGCGAGGCCCTGCTGGACACCGGC
GCCGACGACACCGTGCTGGAGGAGATCAACCTGCCCGGGAAGTGGAAGCCCAAGATGAT 

AGGTGCGCCAGTACGACCAGATCCCCATCGAGATCTGCGGCAAGAAGGCCATCGGCACCGTGCTGGTGGGCCCCACCCCCGT 
GAACATCATCGGCCGCAACATGCTGACCCAGCTGGGCTGCACCCTGAACTTC 

AAGCTGAAGCCCGGCATGGACGGCCCCAAGGTGAAGCAGTGGCCCCTGACCGAGGAGAAGATCAAGGCCCTGACCGAGATCT 

GCACCGAGATGGAGAAGGAGGGCAAGATCTCCAAGATCGGCCCCGAGAACCCCTAGAACACCCCCATCTTCGCCAT^^ 

GAAGAACTCCACCCGCTGGCGCAAGCTGGTGGACTTCCGCGAGCTGAACAAGCGC^ 
GGCATCCCCCACCCCGCCGGCCTGAAGAAGAAGAAGTCCGTGACCGTGCTGGACGTGGGCGACGCCTACTTCTCCGTGCCCC

TGGACCCCGAGTTCCGC^GTACACCGCCTTC^CCATC^ 

CGTGCTGCCCCAGGGCTGGAAGGGCTCCCCCGCCATCT^ 

AACCCCGAGATCGTGATCTACCAGTACATGGACGACCTGTACGTGGGCTCCGACCTGGAGATCGGCCAGCACCGCGCCAAGA 
TCGAGGAGCTGCGCGAGCACCTGCTGCGCTGGGGCTTCTCCACCCCCGACA 

GATGGGCTACGAGCTGCACCCCGACAAGTGGACCGTGCAGCCCATCCAGCTGGCCGAGAAGGACTCCTGGACCGTGAACGAC 

ATCCAGAAGCTGGTGGGCAAGCTGAACTGGGCCTCCCAGATCTACCCCGGCATCAAGGTGAAGCAGCTGTGC 

GCGGCGCCAAGGCCCTGACCGACATCGTGCCCCTGACCACCGAGGCCGAGCTGGAGCTGGCCGAGAACCGCGAGATCCTGAA

GGAGCCCGTGCACGGCGCCTACTACGACCCCTCCAAGGACCTGATCGCCGAGATCCAGAAGCAGGGCCAGGGCCAGTGGACC 

TACCAGATCTACCAGGAGCCCTAGAAGAACCTGAAGACCGGCAAGTACGCGAAGACCCGCT 

GCCAGCTGACCGAGGCCGTGC^GAAGATCGCCATGG^^ 
CCAGAAGGAG^CCTGGGACACCTGGTGGACCGAGTACTGGCAGGC 

CCCCTGGTGAAGCTGTGGTACCAGCTGGAGACCGACCCCATCGCCGGCGCCGAGACCTTCTACGTGGACGGCGCCGCCTCCC 

GCGAGACCAAGCAGGGCAAGGCCGGCTACGTGACCGACCGCGGCCGCCAGAAGGTGGTGTCCCTGTCCGAGACCACCAACCA

GAAGACCGAGCTGCAGGCCATCTACCTGGCCCTGCAGGACTCCGGCTCCGAGGTGAACATCGTGACCGACTCCCAGTACGCC 

ATCGGCATCATCCAGGCCCAGCCCGACCGCTCCGAGTCCGACCTGGTGAACCAGATGATCGAGCAGCTGATCCAGAAGGACA 

AGGTGTACCTGTCCTGGGTGCCCGCCCACAAGGGCATCGGCGGCAACGAGCAGGTGGACAAGCTGGTGTCCAACGGCATCCG 

CAAGGTGCTGTTCCTGGACGGCATCGACAAGGCCCAGGAGGAGCACGAGAAGTACCACAACAACTGGCGCGCCATGGCCTCC 

GACTTCAACCTGCCCCCCGTGGTGGCCAAGGAGATCGTGGCCTCCTGCAA^ 

GCCAGGTGGACTGCTCCCCCGGCATCTGGCAGCTGGACTGCACCCAC^ 

GGCCTCCGGCTACATCGAGGCCGAGGTGATCCCCGCCGAGACCGGCGAGGAGACCGCCTACTTCATCCTGAAGCTGGCCGGC 

CGCTGGCCCGTGAAGATCATCCACACCGACAACGGCCCCAACTTCACCTCCGCCGCCGTGAAGGCCGCCTGCTGGTGGGCCG 

ACATCCAGCAGGAGTTCGGCATCCCCTACAACCCCCAGTCCCAGGGCGTGGTGGAGTCCATGAACAAGGAGCTGAAGAAG 

CATCGGCCAGGTGCGCGACCAGGCCGAGCACCTGAAGACCGCCGTGCAGATGGCCGTGTTCATCCACAACTTCAAGCGCAAG 

GGCGGCATCGGCGGCTACTCCGCCGGCGAGCGCATCATCGACATCATCGCCTCCGACATCCAGACCAAGGAGCTGCAGAAGC 

AGATCACCAAGATCCAGAACTTCCGCGTGTACTACCGCGACTCCCGCGACCCCATCTGGAAGGGCCCCGCCAAGCTGCTGTG 

GAAGGGCGAGGGCGCCGTGGTCATCCAGGACAACTCC^ 

TACGGCAAGCAGATGGCCGGCGACGACTGCGTGGCCGGCCGCCAGGACGAGGACTAA 

79. 2003_CON_06_CPX pol.PEP

FFRENLA-FQQGEAREFSSEQARANSPTRREIxRVRRGDSPLPEAGAEGQGAISLSFPQITIjW 

MMDTVLEDINLPGKWKPKMIGGIGGFIKTO^ 

KLKPGMDGPKVKQWPLTEEKI KAXiTE ICTEMEKEGKI SKIGPENPYNT PI FAI KKKDSTKWRKLVDFRELNKRTQDFWEVQLi 
GIPHPAGLKIOaCSVTVIJ5VGDAYFSVPI^EDFRKYTAFTIPSINNETPGIR^ 

NPEIVIYQYMDDLYVGSDLEIGQHRAKIEEIiREHIiLKWGFTTPDKKHQKEPPFLWMGYEI^PDK^ 

IOKLiVGKIiNWAS Q I YPGI KVKQliCKIjIjRGAKAIjTD I VPIiTAEAELELAENRE I LKEPVHGVYYDPS KDL I AE I QKQGQGQWT 
ynj YQEPHKNIjKTGKYARIKSAHTNDVKQIjTEAVQKIALES I VIWGKTPKFRIjP IQKETWETWWTEYWQATW IPEWEFVNTP 
PLVKLTOQLETEPIVGAETFYVDGAANRETKKGKAGYVTDRGRQKWSLTETTO 
IXSIIQAQPDKSESELVNQIIEQIjIKXEKVYLSWVPAHKGIGGNEQ 
DFNLPPIVAKEIVASCDKCQLKGEAMHGQVDCSPGIWQI^ 

RWPVKVIHTDNGSNFTSAAVKAACWWANITQEFGIPYNPQSQGVVESMNKELKKIIGQ 

GGIGGYSAGERIIDIIASDIQTKELQKQITKIQNFRVYYRDSRDPIWKGPAKLLWKGEGAWIQDNSEIKWPRRK^ 
YGKQMAGDDCVAGRQDED$ 

2003_CON_06_CPX pol.OPT

TTCTTCCGCGAGAACCTGGCCTTCCAGCAGGGCGAGGCCCGCGAGTTCTCCTCCGAGCAGGCCCGCGCCAACTCCCCCACCC 

GCCGCGAGCTGCGCGTGCGCCGCGGCGACTCCCCCCTGCCCGAGGCCGGCGCCGAGGGCCAGGGCGCCATCTCCCTGTCCTT 

CCCCCAGATCACCCTGTGGCAGCGCCCCCTGGTGACCGTGCGCATCGGCGGCCAGCTGATCGAGGCCCTGCTGGACACCGGC 

GCCGACGACACCGTGCTGGAGGACATCAACCTGCCCGGCAAGTGGAAGCCCAAGATGATCGGCGGCATCGGCGGCTTCATCA 

AGGTGCGCCAGTACGACCAGATCCTGATCGAGATCTGCGGCAAGAAGGCCATCGGCACCGTGCTGGTGGGCCCCACCCCCGT

GAACATCATCGGCCGCAACATGCTGACCCAGATCGGCTGCACCCTGAACTTCCCCATCTCCCCCATCGAGACCGTGCCCGTG 

AAGCTGAAGCCCGGCATGGACGGCCCCAAGGTGAAGCAGTGGCCCCTGACCGAGGAGAAGATCAAGGCCCTGACCGAGATCT

GCACCGAGATGGAGAAGGAGGGCAAGATCTCCAAGATCGGCCCCGAGAACCCCTACAACACCCCCATCTTCGCCATO 

GAA.GGACTCCACCAAGTGGCX3CAAGCTGGTGGACTTCCGCGAGCTGAACAAGCGCACCCAGGACT 

GGCATCCCCCACCCCGCCGGCCTGAAGAAGAAGAAGTCCGTGACCGTGCTGGACGTGGGCGACGCCTACTTCTCCGTGCCCC 
TGGACGAGGACTTCCGCAAGTACACCGCCTTCACCATCCCCTCCATCAACAACGAGACCCCCGGCATCCGCTACCAGTACAA
CGTGCTGCCCCAGGGCTGGAAGGGCTCCCCCGCCATCTTCCAGTCCTCCATGATCAAGATCCTGGAGCCCTTCCGCATCAAG 
AACCCCGAGATCGTGATCTACCAGTACATGGACGACCTGTACGTGGGCTCCGACCTGGAGATCGGCCAGCACCGCGCCAAGA 
TCGAGGAGCTGCGCGAGCACCTGCTGAAGTGGGGCTTCACCACCCCCGACAAGAAGCACCAGAAGGAGCCCCCCTTC 
GATGGGCTACGAGCTGCACCCCGACAAGTGGACCGTGCAGCCCATCCAGCTGCCCGACAAGGACTCCTGGACCGTGAACGAC
ATCCAGAAGCTGGTGGGCAAGCTGAACTGGGCCTCCCA^ 

GCGGCGCCAAGGCCCTGACCGACATCGTGCCCCTGACCGCCGAGGCCGAGCTGGAGCTGGCCGAGAACCGCGAGATCCTGAA 
GGAGCCCGTGCACGGCGTGTACTACGACCCCTCCAAGGACCTGATCGCCGAGATCCAGAAGCAGGGCCAGGGCCAGTGGACC
TACCAGATCTTACCAGGAGCCCCACAAGAACCTGAAGACCGGCAAGTACGCCCGCATCA^ 

AGCAGCTGACCGAGGCCGTGCAGAAGATCGCCCTGGAGTCCATCGTGATCTGGGGCAAGACCCCCAAGTTCCGCCTGCCCAT 

CCAGAAGGAGACCTGGGAGACCTGGTGGACCGAGTACTGGCAGGCCACCTGGATCCCCGAGTGGGAGTTCGTGAACACCCCC 

CCCCTGGTGAAGCTGTGGTACCAGCTGGAGACCGAGCCCATCGTGGGCGCCGAGACCTTCTACGTGGACGGCGCCGCCAACC 

GCGAGACCAAGAAGGGCAAGGCCGGCTACGTGACCGACCGCGGCCGCCAGAAGGTGGTGTCCCTGACCGAGACCACCAACCA 

GAAGACCGAGCTGCAGGCCATCAACCTGGCCCTGCAGGACTCCGGCTCCGAGGTGAACATCGTGACCGACTCCCAGTACGCC 

CTGGGCATCATCCAGGCCCAGCCCGACAAGTCCGAGTCCGAGCTGGTGAACCAGATCATCGAGCAGCTGATCAAGA 

AGGTGTACCTGTCCTGGGTGCCCGCCCACAAGGGCATCGGCGGCAACGAGCAGGTGGACAAGCTGGTGTCCACCGGCATCCG 

CAAGGTGCTGTTCCTGGACGGCATCGACAAGGCCCAGGAGGACCACGAGCGCTACCACTCCAACTGGCGCGCCATGGCCTCC 

GACTTCAACCTGCCCCCCATCGTGGCCAAGGAGATCGTGGCCTCCTGCGACAAGTGCCAGCTGAAGGGCGAGGCCATGCACG 

GCCAGGTGGACTGCTCCCCCGGCATCTGGCAGCTGGACTGCACCCACCTGGAGGGCAAGATCATCCTGGTGGCCGTGCACGT 

GGCCTCCGGCTACATCGAGGCCGAGGTGATCCCCGCCGAGACCGGCCAGGAGACCGCCTACTTCATCCTGAAGCTGGCCGGC 

CGCTGGCCCGTGAAGGTGATCCACACCGACAACGGCTCCAACTTCACCTCCGCCGCCGTGAAGGCCGCCTGCTGGTGGGCCA 

ACATCACCCAGGAGTTCGGCATCCCCTACAACCCCCAGTCCCAGGGCGTGGTGGAGTCCATGAACA 

CATCGGCCAGGTGCGCGACCAGGCCGAGCACCTGAAGACCGCCGTGCAGATGGCCGTGTTCATCCACAAC^ 

GGCGGCATCGGCGGCTACTCCGCCGGCGAGCGCATCATCGACATCATCGCCTCCGACATCCAGACCAAGGAGCTGCAGAAGC 

AGATCACCAAGATCCAGAACTTCCGCGTGTACTACCGCGACTCCCGCGACCCCATCTGGAAGGGCCCCGCCAAGCTGCTGTC 

GAAGGGCGAGGGCGCCGTGGTGATCCAGGACAACTCCGAGATCAAGGTGGTGCCCCGCCGCAAGGCCAAGATCATCCGCGAC 

TACGGCAAGCAGATGGCCGGCGACGACTGCGTGGCCGGCCGCCAGGACGAGGACTAA 
80. 2003_CON_08_BC pol.PEP

FFRE ILAFPQGEAREFPPEQTRANS PTSRELQVRGDNPS SEAGTBRQGTIjNFPQITIjWQRPLVS IKVGGQIKEALIjDTGADD 
TVLEEVNLPGKWKPKMIGGIGGFIKVRQYEQIP^ 
PGITOPKVKQWFLTEEKIKALTAICDEME!^ 
HPAGLKKK^VTVLDVGDAYFSVPIJ3KDFRKYTACT^ 
IVIYQYITODLYVGSDLEIGQHRTKIEELREHIJ^ 

LVGKIjNWASQI YPGI KVRQLCKXiLiRGAKAIjTD I VPLTEEAELELAENRE ILiKEPVHGAYYDPSKEIj I AE I QKQGQDQWTYQI 
YQEPFKNTj KTGKYAKMRTAHTNDVKQLiTEAVQKI AME S IV I WGKI PKFRLP IQKETWETWWTDYWQATW I PE WEFVNTP PLV 
KLWYQLEKDPIAGVETFYVDGAANRETKIGKAGYVTDRGRrc^ 
IQAQPDKSESEIATOQIIEQLIKKERVYLSWVPAHKGIGGra 

IjPPI VAKEI VASCDQCQbKGEAMHGQVDCSPGI WQIiDCTKLEGKI I LVAVHVASG Y IEAEVI PAETGQETAYF I LKLiAGRWP 
VKVIHTDNGSNF^SAAVKAAC^AGIQQEFGIPYNPQSQGVVESMNKELKKLIG 

GGYSAGERIVDI IATDIQTREL.QKQI IKIQNFRVYYRDSRDPIWKGPAKLLWKGEGAWIQDNSDIKWPRRKAKI IKDYGK 
QMAGADCVAGRQDED $ 

2003_CON_08_BC pol.OPT

TTCTTCCGCGAGATCCTGGCCTTCCCCCAGGGCGAGGCCCGCGAGTTCCCCCCCGAGCAGACCCGCGCCAACTCCCCCACCT

CCCGCGAGCTGCAGGTGCGCGGCGACAACCCCTCCTCCGAGGCCGGCACCGAGCGCCAGGGCACCCTGAACTTCCCCCAGAT

CACCCTGTGGCAGCGCCCCCTGGTGTCCATCAAGGTGGGCGGCCAGATCAAGGAGGCCCTGCTGGACACCGGCGCCGACGAC 

ACCGTGCTGGAGGAGGTGAACCTGCCCGGCAAGTGGAAGCCCAAGATGATCGGCGGCATCGGCGGCTTCATCAAGGTGCGCC 

AGTACGAGCAGATCCCCATCGAGATCTGCGGCAAGAAGGCCATCGGCACCGTGCTGGTGGGCCCCACCCCCGTGAACATCAT 

CGGCCGCAACATGCTGACCCAGCTGGGCTGCACCCTGAACTTCCCCATCTCCCCCATCGAGACCGTGCCCG 

CCCGGCATGGACGGCCCCAAGGTGAAGCAGTGGCCCCTGACCGAGGAGAAGATCAAGGCCCTGACCGCCATCTGCGACGAGA 

TGGAGAAGGAGGGCAAGATCACCAAGATCGGCCCCGACAACCCCTACAACACCCCCATCTTCGC^^ 

CTCCAAGTGGCGCAAGCT'GGTGGACTTCCGCGAGCTGAAGAAGCGCACCCAGGACTTCT^ 

CACCCCGCCGGCCTGAAGAAGAAGAAGTCCGTGACCGTGCTGGACGTGGGCGACGCCTACTTCTCCGTGCCCCTGGACAAGG 
ACTTCCGCAAGTACACCGCCTTCACCATCCCCTCCGTGAAC^ 

CCAGGGCTGGAAGGGCTCCCCCGCCATCTTCCAGTGCTCCATGACCAAGATCCTGGAGCCCTTCCGCAAGCAGAACCCCGAC 
ATCGTGATCTACCAGTACATGGACGACCTGTACGTGGGCTCCGACCTGGAGATCGGCCAGCACCGCACCAAGATCGAGGAGC 

TGCGCGAGC^CCreerGAAGTGGGGCrcC^ 

CGAGCTGCACCCCGACAAGTGGACCGTGCAGCCCATCCAGCTGCCCGAGAAGGACTCCTGGACCGTGAACGACATCCAGAAG 
CTGGTGGGCAAGCTGAACTGGGCCTCCCAGATCTACCCCGGGATCAAGGTGCGCCAGCTGTGCAAGCTGCTGCGCGGCGCCA 
AGGCCCTGACCXSACATCGTGCCCCTTCACCGAGGAGGCCGAGCTGGAGCTGGCCGAGAACCGC^ 

GCACGGCGCCTACTACGACCCCTCCAAGGAGCTGATCGCCGAGATCCAGAAGCAGGGCCAGGACCAGTGGACCTACCAGATC 
TACCAGGAGCCCTTCAAGAACCTGAAGACCGGCAAGTACGCCAAGATGCGCACCGCCCACACCAACGACGTGAAGCAGCTGA
CCGAGGCCGTGCAGAAGATCGCCATGGAGTCCATCGTGATCTGGGGCAAGATCCCCAAGTTCCGCCTGCCCATCCAGAAGGA

GACCTGGGAGACCTGGTGGACCGACTACTGGCAGGCCACCTGGATCCCCGAGTGGGAGTTCGTGAACACCCCCCCCCTGGTG

AAGCTGTGGTACCAGCTGGAGAAGGACCCCATCGCCGGCGTGGAGACCTTCTACGTGGACGGCGCCGCCAACCGCGAGACCA 

AGATCGGCAA<3GCCGGCTACGTGACCGACCGCGGCCGCAAGAAGATCGTGTCCCTGACCGACACCACCAACCAGAAG^ 

GCTGCAGGCCATCTACATCGCCCTGCAGGACTCCGGCTCCGAGGTGAACATCGTGACCGACTCCC^ 

ATCCAGGCCCAGCCCGACAAGTCCGAGTCCGAGCTGGTGAACCAGATCATCGAGCAGCTGATCAAGAAGGAGCGCGTGTACC 

TGTCCIX3GGTGCCCGCCCACAAGGGCATCGGCGGCAATOAGCAGGTGGAC^^ 

GTTCCTGGACGGCATCGACAAGGCCC^GGAGKSAGCACGAGAAGTACCACTCCAACTGGCGTC 

CTGCCCCCCATCGTGGCCAAGGAGATCGTGGCCTCCTGCGACCAGTGCCAGCTGAAGG 

ACTGCTCCCCCGGCATCTGGCAGCTGGACTGCACCCACCTGGAGGGCAAGATCATCCTGGTGGCCGTGCA 

CTACATCGAGGCCGAGGTGATCCCCGCCGAGACCGGCCAGGAGACCGCCTACTTCATCCTGAAGCTGGCCGGCCGCTGGCCC 

GTGAAGGTGATCCACACCGACAACGGCTCGAACTTCACCTCCGCC 

AGGAGTTCGGC^TCCCCTACAACCCCCAGTCCGAGGGCGTGGTGGAGTCCATG 

GGTGCGCGACCAGGCCGAGCACCTGAAGACCGCCGTGCAGATGGCCGTGTTCATCCACAACTTCAAGCGCAA 

GGCGGCTACTCCGCCGGCGAGCGCATCGTGGACATCATCGCCACCGACATCCAGACCCGCGAGCTGCAGAAGCAGATCATCA

AGATCCAGAACTTCCGCGTGTACTACCGCGACTCCCGCGACCCCATCTGGAAGGGCCCCGCCAAGCTGCTGTGGAAGGGCGA 

GGGCGCCGTGGTGATCCAGGACAACTCCGACATCAAGGTGGTGCCCCGCCGCAAGGCCAAGATCATCAAGGAC 

CAGATGGCCGGCGCCGACTGCGTGGCCGGCCGCCAGGACGAGGACTAA

81. 2003_CON_10_CD pol.PEP

FFRENliAFQQRKARELPSEQTRANSPTSRELRVWGGDNTLSE 
ADDTVLEEMNLPGKWKPKMIGGIGGFIKVRQYDQIDIEICGYKAI^ 

KLKPGMDGPKVKQWPIjTEEKI KALTE ICTEMEKEGKI SRIGPENPYNTP IFAIKKKDSTKWRKLVDFREIjNKRTQDFWEVQL 

GIPHPAGLKKXKSVTVIjDVGDAYFSVPLYEDFRKYTAFTIPSINOT^ 

NPEWVIYQYMDDLYVGSDIjEIGQHRIKIEELRGHLIjKWGFTTPDKKHQK^ 

I QKLVGKLNWAS QI YPG I KVRQLCKLLRGAKAI/TD I VPLTEEAEIjEIiAENRE IIiKEPVHGVYYDP S KDL I AE I QKQGQDQWT 
YQIYQEPHKNLKTGKYAKRRTAHTNDVKQLTEA 

PLVIOjWYQIjEKEPIVGAETFYVDGAANRETKIjGKAGYVTDRGRQKVISITDTTNQKTELQ 
LGI IQAQPDKSESELVNQIIEQLIKKEKVYLSWVPAHKGIGGNEQVDKLV 

DFNLPPWAKEIVASCDKCQLKGEAL.HGQVDCSPGIWQIjDCTHIjEGKVILVAVHVASGYIEAEV 

RWPVKVVHTDNGSNFTSAAVKAACWWAGIKQEFGI PYNPQSQGWESMNKEIjKKI IGQVRDQAEHLiKTAVQMAVFIHNFKRK 
GGIGGYSAGERI IDI I ATDIQTKELQKQI IKIQNFRVYYRDSRDPIWKGPAKLLWKGEGAWIQDNSDIKVVPRRKVKI IKD 
YGKQMAGADCVASRQDEDQ 

2003_CON_10_CD pol.OPT

TTCTCCCGCGAGAACCTGGCCTTCCAGC^GCGC^<^ 

CCCGCGAGCTGCGCGTGTGGGGCGGCGACAACACCCTGTCCGAGACCGGCGCCGAGCGCCAGGGCGCCGTGTCCCTGTCCTT 
CCCCCAGATCACCCTGTGGCAGCGCCCCCTGGTGACCGTGAAGATCGGCGGCCAGCTGAAGGAGGCCCTGCTGGACACCGGC 
GCCGACGACACCGTGCTGGAGGAGATGAACCTGCCCGGCAAGTGGAAGCCCAAGATGATCGGCGGCATCGGCGGCTTCATCA 
AGGTGCGCCAGTACGACCAGATCCTGATCGAGATCTGCGGCTACAAGGCCATCGGCACCGTGCTGGTGGGCCCCACCCCCGT 
GAACATGATCXK3CCGCAACCTGCTGACCGAGATCGGCTGCACCCTGAACTTCCCCATCTCCCCCATCGAG 

AAGCTGAAGCCCGGCATGGACGGCCCCAAGGTGAAGCAGTGGCCCCTGACCGAGGAGAAGATCAAGGCCCTGACCGAGATCT
GGACCGAGATGGAGAAGGAGGGCAAGATCTrCCCGCATCGGCCCCGAGAACCCCTAC^ 

GAAGGACTCCACCAAGTGGCGCAAGCTGGTGGACTTCCGCGAGCTGAACAAGCGCACCCAGGACTTCTGGGAGGTGCAGCTG 

GGCATCCCCCACCCCGCCGGCCTGAAGAAGAAGAAGTCCGTGACCGTGCTGGACGTGGGCGACGCCTACTTCTCCGTGCCCC 

TGTACGAGGACTTCCGCAAGTACACCGCCTTCACCATCCCCTCCATCAAGAACGAGACCCCCGGCAT 

CGTGCTGCCCCAGGGCTGGAAGGGCTCCCCCGCCATCTTCCAGTCCTCGATGACCAAGATC^^ 

AACCCCGAGATGGTGATCTACCAGTACATGGACGACCTGTACGTGGGCTCCGACCTGGAGATCGG 

TCGAGGAGCTGCGCGGCCACCTGCTGAAGTGGGGCTTCACCACCCCCGACAAGAAGCACCAGAAGGAGCCCCCCTTCCTGTG 

GATGGGCTACGAGCTGCACCCCGACAAGTGGACCGTGCAGCCCATCCAGCTGCCCGAGAAGGACTCCTGGACCGTGAA 

ATCCAGAAGCTGGTGGGCAAGCTGAACTGGGCCTCCCAGATCTACCCCGGCATCAAGGTGCGCCAGCT^ 

GCGGCGCCAAGGCCCTGACCGACATCGTGCCCCTGACCGAGGAGGCOGAGCTGGAGCTGGCCGAGAACCGCGAGATCCTGA^ 
GGAGCCCGTGCACGGCGTGTACTACGACCCCTCCAAGGACCTGATCGCCGAGATCCAGAAGCAGGGCCAGGACCAGTGGACC 
TACC^GATCTACCAGGAGCCCCACAAGAACCTGAAGACCGGCy^GTACGCGAAGCGCCGC^ 

AGGAGCTGACCGAGGCCGTGCAGAAGATCGCCCAGGAGTCCATCGTGATCTGGGGCAAGACCCCCAAGTTCCGCCTGCCCAT 
CCAGAAGGAGACCTGGGAGACCTGGTGGACCGACTACTGGCAGGCCACCTGGATCCCCGAGTGGGAGTTCGTGAACACCCCC 
CCCCTGGTGAAGCTGTGGTACCAGCTGGAGAAGGAGCCCATCGTGGGCGCCGAGACCTTCTACGTGGACGGCGCCGCCAACC
GCGAGACCAAGCTGGGCAAGGCCGGCTACGTGACCGACCGCGGCCGCCAGAAGGTGATCTCCATCACCGACACCACCAACCA 
GAAGACCGAGCTGCAGGCCATCAACCTGGCCCTGCAGGACTCCGGCTCCGAGGTGAACATCGTGACCGACTCCCAGTACGCC
CTGGGCATCATCCAGGCCCAGCCCGACAAGTCCGAGTCCGAGCTGGTGAACCAGATCATCGAGCAGCTGATCAAGAAGGAGA

AGGTGTACCTGTCCTGGGTGCCCGCCCACAAGGGCATCGGCGGCAACGAGCAGGTGGACAAGCTGGTC 

CAAGGTGCTGTTCCTGGACGGCATCGACAAGGCCCAGGAGGAGCACGAGAAGTACCACAACAACTGGCGCGCCATG 

GACTTCAACCTGCCCCCCGTGGTGGCC/^GGAGATCGTGGCCTCCTGCGACAAGTGCCAGCTGAAGGGCGAGGCCCTGCAC^ 

GCCAGGTGGACTGCTCCCCCGGCATCTGGCAGCTGGACTGCACCCACCTGGAGGGCAAGGTGATCCTGGTGGCCGTGCACGT 

GGCCTCCGGCTACATCGAGGCCGAGGTGATCCCCGCCGAGACCGGCCAGGAGACCGCCTACTTCCTGCTGAAGCTGGCCGGC

CGCTGGCCCGTGAAGGTGGTGCACACCGACAACGGCTCCAACTTCACCTCCGCCGCC 

GCATCAAGGAGGAGTTCGGCATCCCCTACAACCCCCAGTCCCAGGGCGT^ 

CATCGGCCAGGTGCGCGACCAGGCCGAGCACCTGAAGACCGCCGTGCAGATGGCCGTGTTCATCCA 
GGCGGCATCGGCGGCTACTCCGCCGGCGAGCGCATCATCGACATCATCGCCA^ 

AGATCATCAAGATCCAGAACTTCCGCGTGTACTACCGCGACTCCCGCGACCCCATCTGGAAGGGCCCCGCCAAGCTGCTGTG 
GAAGGGCGAGGGCGCCGTGGTGATCCAGGACAACTCCGACATCAAGGTGGTGCCCCGCCGCAAGGTGAAGATCATCAAGGAC 
TACGGCAAGCAGATGGCCGGCGCCGACTGCGTGGCCTCCCGCCAGGACGAGGACCAG

82. 2003_CON_11_CPX pol.PEP 

FFRENIJ^QCKSEAREFSPEQARANSPTSREI^VRGGDSPIjPBTGAEGEGAISFNFPQ 

ADDTVTLEE IDLPGRWKPKM I GGI GGF I KVRQYEE 1 1 IE I EGKKAIGTVLVGPTPVNI IGRNMLTQ IGCTXiNFP I SP IDTVPV 
KXiKPGMDGPKVKQWPIjTEEKI KALTE I CTEME KEGKI S KI ^ 
GIPHPAGLKKKKSVTVIiDVGDAYFSVPI^ESFRKYTAPT^ 
NPEIVIYQYlTODLYVGSDLEIGQHREKNTCEIiRiaililjKWGFTTP^^ 

I QKL VG KLtNW AS Q I YPG I KVKQL,CKL.LRGTKAIjTD I VPLT AE AELELAENRE I LKEPVHGVYYDP S KDL I AEVQKQGLDQWT 
YQIYQEPFKNIjKTGKYAKRRTAHTNDVRQIjAEVVQKISMESIVIWGKI pkfrlpiqretwetwwtdywqatwipewefvntp 
PLVKIjWYQLEKEPI IGAETFYVDGAANRETKIKSKAGYVTD^ 
LGIIQAQPDKSESELVSQIIEQLIKKEKVYLSWVPAHKGIGGNEQVDK^ 
DFl^PPIVAKEIVASCDKCQLKGEAMHGQVDCSPGIWQLD^ 

RWPVKVIHTDNGSNFTSAAVKAACWWANIQQEFGI PYNPQSQGWESMNKELKKI IGQVREQAEHL.KTAVQMAVFIHNFKRK 

GGIGGYSAGERITOIIATDLQTKELQKQITKIQNFRVYYRDSRDPI 

YGKQMAGDDCVAGRQDED $ 

2003_CON_11_CPX pol.OPT

TTCTTCCGCGAGAACCTGGCCTTCCAGCAGGGCGAGGCCCGCGAGTTCTCCCCCGAGCAGGCCCGCGCCAACTCCCCCACCT 
CCCGCGAGCTGCGCGTGCGCGGCGGCGACTCCCCCCTGCCCGAGACCGGCGCCGAGGGCGAGGGCGCCATCTCCTTCAACTT 
CCCCCAGATCACCCTGTGGCAGCGCCCCCTGGTGACCATGAAGGTGGCCGGCCAGCT^ 

GCCGACGACACCGTGCTGGAGGAGATCGACCTGCCCGGCCGCTGGAAGCCGAAGATGATCGGCGGCATCGGCGGCTTCATCA 

AGGTGCGCCAGTACGAGGAGATCATCATCGAGATCGAGGGCAAGAAGGCCATCGGCACCGTGCTGGTGGGCCCCACCCCCGT 

GAACATCATCGGCCGCAACATGCTGACCCAGATCGGCTGCACCCTGAACTTCCCCATCTCCCCCATCGACACCGTGCCCGTG 

AAGCTGAAGCCCGGCATGGACGGCCCCAAGGTGAAGCAGTGGCCCCTGACCGAGGAGAAGATCAAGGCCCTGACCGAGATCT 

GCACCGAGATGGAGAAGGAGGGCAAGATCTCCAAGATCGGCCCCGAGAACCCCTACAACACCCCCGTGTTCGCCATCAAG 

GAAGGACTCCACCAAGTGGCGCAAGCTGGTGGACTTCCGCGAGCTGAACAAGCGCACCCAGGACTTCTGGGA 

TGGACGAGTCCTTCCGCAAGTACACCGCCTTCACCATCCCCTCCATCA^ 
CGTGCTGCCCCAGGGCTGGAAGGGCTCCCCCGCCATCTTCCAGTCCTCCATGACC^ 

AACCCCGAGATCGTGATCTACCAGTACATGGACGACCTGTACGTGGGCTCCGACCTGGAGATCGGCCAGCACCGCGAGAAGG 
TGGAGGAGCTGCGCAAGCACCTGCTGAAGTGGGGCTTCACCACCCCCGACAAGAAG

GATGGGCTACGAGCTGCACCCCGACAAGTGGACCGTGCAGCCCATCCAGCTGCCCGACAAGGAGTGCTGGACCGTGAACG 
ATCCAGAAGCTGGTGGGCAAGCTGAACTGGGCCTCCCAGATCT 

GCGGCACCAAGGCCCTGACCGACATCGTGCCCCTGACCGCCGAGGCCGAGCTGGAGCTGGCCGAGAACCGCGAGATCCTGAA 
GGAGCCCGTGCACGGCGTGTACTACGACCCCTCCAAGGACCTGATCGCCGAGGTGCAGAAGCAGG^ 

TACCAGATCTACCAGGAGCCCTTCAAGAACCTGAAGACCGGCAAGTACGCCAAGCGCCGCACCGCCCACACCAACGACGTGC 

GCCAGCTGGCCGAGGTGGTGCAGAAGATCTCCATGGAGTCCATCGTGATCTGGGGCAAGATCCCCAAGTTCCGCCTGC

CCAGCGCGAGACCTGGGAGACCTGGTGGACCGACTACTGGCAGGCCACCTGGATCCCCGAGTGGGAGTTCGTGAACACCCCC

CCCCTGGTGAAGCTGTGGTACCAGCTGGAGAAGGAGCCCATCATCGGCGCCGAGACCTTCTACGTGGACGGCGCCGCCAACC 

GCGAGACCAAGCTGGGCAAGGCCGGCTACGTGACCGACAAGGGCCGCCAGAAGGTGGTGACCCTGACCGAGACCACCAA^ 

GAAGACCGAGCTGGAGGCCATCC^CCTGGCCCTGCAGGACTCCGGC^ 

CTGGGCATCATCCAGGCCCAGCCCGACAAGTCCGAGTCCGAGCTGGTGTCCCAGATCATCGAGCAGCTGATCAAGAAGGAGA 

AGGTGTACCTGTCCTGGGTGCCCGCCCACAAGGGCATCGGCGGCAACGAGCAGGTGGACAAGCTGGTGTCCTCCGGCATCCG 

CAAGGTGCTGTTCCTGGACGGCATCGACAAGGCCCAGGAGGAGCACGAGCGCTACCACTCCAACTGGCGCGCCATGGCCTCC

GACTTCAACCTGCCCCCCATCGTGGCCAAGGAGATCGTGGCCTCCTGCGACAAGTGC 

GCCAGGTGGACTGCTCCCCCGGCATCTGGCAGCTGGACTGCACCGACCTGGAGGG 
GGCCTCCGGCTACATCGAGGCCGAGGTGATCCCCGCCGAGACCGGCCAGGAGACCGCCTACTTCATCCTGAAGCTGGCCGGC
CGCTGGCCCGTGAAGGTGATCCACACCGACAACGGCTCCAACTTCAC

ACATCCAGCAGGAGTTCGGCATCCCCTACAACCCCCAGTCCCAGGGCGTGGTGGAGTCCATGAACAA 
GATCGGCCAGGTGCGCGAGCAGGCCGAGCACCTGAAGACCGCCGTGCAGATGGCCG

GGCGGCATCGGCGGCTACTCCGCCGGCGAGCGCATCGTGGACATCATCGCCACCGACCTGCAGACCAAGGAGCTGCAGAAGC 
AGATCACCAAGATCCAGAACTTCCGCGTGTACTACCGCGACTCCCGCGACCCCATCTGGAAGGGCCCCGCCAAGCTGCTGTG 
GAAGGGCGAGGGCGCCGTGGTGATCCAGGACAACTCCGACATCAAGGTGGTGCCCCGCCGCAAGGCCAAGATCATCCGCGA
TACGGCAAGCAGATGGCCGGCGACGACTGCGTGGCCGGCCGCCAGGACGAGGACTAA 

83. 2003_CON_12_BF pol.PEP

FFRENLAFQQGEARKFPSEQARANSPASRELWVRRGDNPLSEAGAERRGTVPSL^

GADDTVLEDINLPGKWKPKMIGGIGGFIKVKQYDNILIEICGHKAIGTVLVGPTPVNIIGRNLL

VKLKPGMDGPKVKQWPLTEEKIKALTEICTEMEKEGKISKIGPENPYNTPVFAIKKKDSTKWRKLVDFRELNKRTQDFWEVQ
LGIPHPAGLKKKKSVTVLDVGDAYFSVPLDKDFRKYTAFTIPSVNNETPGIRYQYNVLPQGWKGSPAIFQSSMTKILEPFRK
QNPDIVIYQYMDDLYVGSDLEIGQHRTKIEELRQHLLRWGFTTP
DIQKLVGKLNWASQIYPGIKVKQLCRLLRGTKALTK

TYQIYQEPFKNLKTGKYARMRGAHTNDVKQLTEAVQKITTESIVIWGKTPKFRLPILKETWDTWWTEYWQATWIPEWEFVNT
PPLVKLWYQLETEPIAGAETFYVDGASNRETKKGKAGYVTDRGRQKAVSLTETTNQ 
ALGIIQAQPDKSESELVNQIIEQLIKKEKVYLSWVPAHKGIGGNEQVD^
SDFNLPPVVAKEIVASCDKCQLKGEAMHGQVDCSPGIWQI^

GRWPVKTIHTDNGPNFSSAAVKAACWWAGIQQEFGIPYNPQSQGVVESMNKELKKIIRQVRDQAEHLKTAVQMAVFIHNFKR
KGGIGGYSAGERIIDIISTDIQTRELQKQIIKIQNFRVYYRDSRDPVWKGPAKLLWKGEGAVVIQDNSEIKVVPRRKAKIIR
DYGKQMAGDDCVAGRQDED$ 



2003_CON_12_BF pol.OPT 

TTCTTCCGCGAGAACCTGGCCTTCCAGCAGGGCGAGGCCCGCAAGTTCCCCTCCGAGCAGGCCCGCGCCAACTCCCCCGCCT

CCCGCGAGCTGTGGGTGCGCCGCGGCGACAACCCCCTGTCCGAGGCCGGCGCCGAGCGCCGCGGCACCGTGCCCTCCCTGTC 

CTTCCCCCAGATCACCCTGTGGCAGCGCCCCCTGGTGACCATCAAGGTGGGCGGCCAGCTGAAGGAGGCCCTGCTGGACACC 

GGCGCCGACGACACCGTGCTGGAGGACATCAACCTGCCCGGCAAGTGGAAGCCCAAGATGATCGGCGGCATCGGCGGCTTCA 

TCAAGGTGAAGCAGTACGACAACATCCTGATCGAGATCTGCGGCCACAAGGCC^ 

CGTGAACATCATCGGCCGCS^CCTGCTGACCCAGCTGGGCTC 

GTGAAGCTGAAGCCCGGCATGGACGGCCCGAAGGTGAAGCAGTGGCCCCTGACCGAGGAGAAGATCAAGGCCCTGACCGAGA 

TCTGCACCGAGATGGAGAAGGAGGGCAAGATCTCCAAGATCGGCCCCGAGAACCCCTAC^ 

GAAGAAGGACTCCACCAAGTGGCGCAAGCTGGTGGACTTCCGCGAGCTGAAC^ 

CTGGGCATCCCCCACCCCGCCGGCCTGAAGAAGAAGAAGTCCGTGACCGTGC

CCCCTGGACAAGGACTTCCGCAAGTACACCGCCTTCACCATCCCCTCCG

C^CGTGCTGCCCCAGGGCTGGAAGGGCTCCCCCGCCATCTTCCAGTCCTCCATGAC 

CAGAACCCCGACATCGTGATCTACCAGTACATGGACGACCTGTACGTGGGCTCCGACCTGGAGATCGGCCAGCACCGCACCA 

AGATCGAGGAGCTGCGCCAGCACCTGCTGCGCTGGGGCTTCACGACCCCCGAC^ 

GTGGATGGGCTACGAGCTGCACCCCGACAAGTGGACCXS^

GACATCCAGAAGCTGGTGGGCAAGCTGAACTGGGCCTCCCAGATCTACCCC^ 

TGCGCGGCACCAAGGCCCTGACCGAGGTGATCCCCCTGACCAA

GAAGGAGCCCGTGCACGGCGTGTACTACGACCCCTCCAAGGACCTGATCGCCGAGATCCAGAAGCAGGGCCAGGGCCAGTGG 
ACCTACCAGATCTACCAGGAGCCCTTCAAGAACCTGAAGACCGGCAAGTACGCCCGCATGCGCGGCG^

TGAAGCAGCTGACCGAGGCCGTGCAGAAGATCACCACCGAGTCCATCGTGATCTGGGGCAAGACCCCCAAGTTCCGCCTGCC

CATCCTGAAGGAGACCTGGGACACCTGGTGGACCGAGTACTGGCAGGCCACCTGGATCCC

CCCCCCCTGGTGAAGCTGTGGTACCAGCTGGAGACCGAGCCC^^ 

ACCGCGAGACCAAGAAGGGCAAGGCCGGCTACGTGACCGACCGCGGCCGCCAGAAGGCCGTGTCCCTGACCGAGACCACCAA 
CCAGAAGGCCGAGCTGCACGCCATCCAGCTGGCCCTGCAGGACTCCGGCTCCGAGGTGAACATCGTGACCGACTCCCAGTAC 
GCCCTGGGCATCATCCAGGCCCAGCCCGACAAGTCCGAGTCCGAGCTGGTGAACCAGATCATCGAGCAGCTGATCAAGAAGG 
AGAAGGTGTACCTGTCCTGGGTGCCCGCCCAGAAGGGCATCGGCGGCAACGAGCAGGTGGACAAGCTGGTGTCCGCCGGCAT 
CCGCAAGATCCTGTTCCTGGACGGCATCGACAAGGCCCAGGAGGAGCACGAGAAGTACCACAAC^

TCCGACTTCAACCTGCCCCCCGTGGTGGCCAAGGAGATCGTGGCCTCCTGCGACAAGTGCCAGCTGAAGGGCGAGGCCATGC 
ACGGCCAGGTGGACTGCTCCCCCGGCATCTGGC^^

CGTGGCCTCCGGCTACCTGGAGGCCGAGGTGATCCCCGCCGAGACCGGCCAGGAGACCGCCTACTTCATCCTGAAGCTGGCC 
GGCCGCTGGCCCGTGAAGACCATCGACACCGACAACGGCCCCAACTTCTCCT

CCGGCATCCAGCAGGAGTTCGGCATCCCCTACAACCCCCAGTCCCAGGGCGTGGTGGAGTCCATGAACAAGGAGCTGAAGAA 
GATCATCCGCCAGGTGCGCGACCAGGCCGAGCACCTGAAGACCGCCGTGCAGATGGCCGTGTTCATCCACAACTTCAAGCGC 
AAGGGCGGCATCGGCGGCTACTCCGCCGGCGAGCGCATCATCGACATCATCTCCACCGACATCCAGACCCGCGAGCTGCAGA
AGCAGATCATCAAGATCCAGAACTTCCGCGTGTACTACCGCGACTCCCGCGACCCCGTGTGGAAGGGCCCCGCCAAGCTGCT
GTGGAAGGGCGAGGGCGCCGTGGTGATCCAGGACAACTCCGAGATCAAGGTGGTGCCCCGCCGCAAGGCCAAGATCATCCGC
GACTACGGCAAGCAGATGGCCGGCGACGACTGCGTGGCCGGCCGCCAGGACGAGGACTAA 

84. 2003_CON_14_BG pol.PEP

FFRENLAFQQGEAREFSPEQARANSPTRRELWVRRGDSPLPEA

ADDTVLEDINLPGKWKPKMIGGIGGFIK^
KLKPGMDGPKVKQWPLTEEKIKALTDICTEME^

GIPHPSGLKKKKSVTVLDVGDAYFSVPLDESFRKYTAFTIPSTNNETPGIRYQYNVLPQGWKGSPAIFQSSMTKILEPFRIK

NPEIVIYQYMDDLYVGSDLEIGQHRAKIEELRKHLLS^^

IQKLVGKLNWASQIYPGIKVKQLCKLLRGAKALTDIVPLTAEAEL

YQIYQEPYKNLKTGKYAKRGSAHTNDVKQLTEVVQ

PLVKLWYRLETEPIAGAETYYVDGAANRETKLGKAGYV

LGIIQAQPDRSESEVVNQIIEQLIKKEKVYLSWVPAHKGIGGNEQVDKLVSSGIRKVL

DFNLPPVVAKEIVASCDKCQLKGEAMHGQVDCSPGIWQLDCTHLEGKIILVAVHVASGYIEAEVIPAETGQETAYFILKLAG
RWPVKIIHTDNGSNFTSAAVKAACWWANITQEFGIPYNPQSQGVVESMNKELKKIIGQVRDQAEHLKTAVQMAVFIHNFKRK
GGIGGYSAGERIIDIIASDIQTKELQKQITKIQNFRVYFRDSRDPIWKGPAKLLWKGEGAVVIQDNNEIKVVPRRKAKIIRD
YGKQMAGDDCVAGRQDED$

2003_CON_14_BG pol.OPT

TTCTTCCGCGAGAACCTGGCCTTCCAGGAGGGCGAGGCCCGCGAGTTCTCCCCCGAGCAGGCCCGCGCCAACTCCCCCACCC
GCCGCGAGCTGTGGGTGCGCCGCGGCGACTCCCCCCTGCCCGAGGCCCGCGCCGAGGGCAAGGGCGACATCCCCCTGTCCCT 
GCCCCAGATCACCCTGTGGCAGCGCCCCCTGGTGACCGTGCGCATCGG 

GCCGACGACACCGTGCTGGAGGACATCAACCTGCCCGGCAAGTGGAAGCCCAAGATGATCGGCGGCATCGGCGGCTTCATCA 

AGGTGCGGCAGTACGACCAGATCCTGATCGAGATCTGCGGCAAGAAGGCCATCGGCACCGTGCTGGTGGGCCCCACCCCCAT

CAACATCATCGGCCGCAACATGCTGACCCAGATCGGCTGCACCCTGAACTTCCCCATCTCCCCCATCGAGACCGTGCCCGTG 

AAGCTGAAGCCCGGCATGGACGGCCCCAAGGTGAAGCAGTGGCCCCTGACCGAGGAGAAGATCAAGGCCCTGACCGACATCT 

GCACCGAGATGGAGCGCGAGGGCAAGATCTCCAAGATCGGCCCCGAGAACCCCTACAACACCCCCATCTTCGCCATCAAGAA

GAAGGACTCCACCAAGTGGCGCAAGCTGGTGGACTTCCGCGAGCTGAACAAGCGCACCCAGGACTTGTGGGAGGTG 

GGCATCCCCCACCCCTCCGGCCTGAAGAAGAAGAAGTCC^

TGGACGAGTCCTTCCGCAAGTACACCGCCTTC^

CGTGCTGCCCCAGGGCTGGAAGGGCTCCCCCGCCATCTTCCAGTCCTCCATGACCAAGATCCTGGAGCCCTTCCGCATCAAG 
AACCCCGAGATCGTGATCTACCAGTACATGGACGACCTGTACGTGGGCTCCGACCTGGAGATCGGCCAGCACCGCGCCAAGA
TCGAGGAGCTGCGCAAGCACCTGCTGTCCTGGGGCTTCACCACCCCCGACAAG

GATGGGCTACGAGCTGCACCCCGACAAGTGGACCGTGCAGCCCATCCAGCTGCCCGACAAGGAGTCCTGGACCGTGAACGAC
ATCCAGAAGCTGGTGGGCAAGCTGAACTGGGCCTCCCAGATCTACCCCGGG 

GCGGCGCCAAGGCCCTGACCGACATCGTGCCCCTGACCGCCGAGGCCGAGCTGGAGCTGGCCGAGAACCGCGAGATCCTGAA 
GGAGCCCGTGCACGGCGTGTACTACGAGCCCTCCAAGGAGCTGATCGCCGAGGTGCAGAAGCAGGGCCTGGACCAGTGGACC
TACCAGATCTACCAGGAGCCCTACAAGAACCTGAAGACCGGCAAGTACGCCAAGCGCGGCTCCGCCCACACCAACGACGTGA 
AGCAGCTGACCGAGGTGGTGCAGAAGATCGCCACCGAGTCCATCGTGATCTGGGGCAAGACCCCCAAGTTCAAGCTGC 
CCGCAAGGAGACCTGGGAGGTGTGGTGGACCGAGTACTGGCAGGCC^ 

CCCCTGGTGAAGCTGTGGTACCGCCTGGAGACCGAGCCCATCGCCGGCGCCGAGACCTACTACGTGGACGGCGCCGCCAACC
GCGAGACCAAGCTGGGCAAGGCCGGCTACGTGACCGACAAGGGCAAGCAGAAGATCATCACCCTGA^ 

GAAGGCCGAGCTGCAGGCCATCCACATCGCCCTGCAGGACTCCGGCTCCGAGGTGAACATCGTGACCGACTCCCAGTACGCC

CTGGGCATCATCCAGGCCCAGCCCGACCGCTCCGAGTCCGAGGTGGTGAACCAGATCATCGAGCAGCTGATCAAGAAGGAGA 

AGGTGTACCTGTCCTGGGTGCCCGCCCACAAGGGCATCGGCGGCAACGAGCAGGTGGACAAGCTGGTGTCCTCCGGCA 

CAAGGTGCTGTTCCTGGACGGCATCGACAAGGCCCAGGAGGAGCACGAGAAGTACCACTCCAACTGGCGCGCCATGGCCTC 

GACTTCAACCTGCCCCCCGTGGTGGCCAAGGAGATCGTGGCCTCCTGCGACAAG 

GCCAGGTGGACTGCTCCCCCGGCATCTGGCAGCTGGACTGCACCCACCTGGAGGGCAAGATCATCCTGGTGGCCGTGCACGT 

GGCCTCCGGCTACATCGAGGCCGAGGTGATCCCCGCCGAGACCGGCCAGGAGACCGCCTACTTCATCCTGAAGCTGGCCGGC

CGCTGGCCCGTGAAGATCATCCACACCGACAACGGCTCCAACTTCACCTCCGCCGCCGTGAAGGCCGCCTGCTGGTGGGCCA 

ACATCACCCAGGAGTTCGGCATCCCCTACAACCCCCAGTCCCAGGGCGTGGTGGAGTCCATGAACAAGGAGCTGAAGAAGA^ 

CATCGGCCAGGTGCGCGACCAGGCCGAGCACCTGAAGACCGCCGTGCAGATGGCCGTGTTCATCCACAACTTCAAGCGCAAG 

GGCGGCATCGGCGGCTACTCCGCCGGCGAGCGCATCATCGACATCATCGCCTCCGACATCCAGACCAAGGAGCTGCAGAAGC 

AGATCACCAAGATCCAGAACTTCCGCGTGTACTTCCGCGACTCCCGCGACCCCATCTGGAAGGGCCCCGCCAAGCTGCTGTG

GAAGGGCGAGGGCGCCGTGGTGATCCAGGAGAACAACGAGATGAAGGTGGTGCCCCGCCGCAAGGCCAAGATCATCCGCG 

TACGGCAAGCAGATGGCCGGCGACGACTGCGTGGCCGGCCGCCAGGACGAGGACTAA 